Re: [R] how to compare two datasets in R>?

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 2:14 AM, song song wrote:


hi, everybody, my question is:

suppose I have two data sets. Set A is large and has variables like ID,
Gender, Income.  Set B is small and suppose it only has ID.

Now I want to get a subset of data set A that contains the IDs from Set B.


Try:

A[A$ID %in% B$ID, ]

Or:

subset(A, ID %in% B$ID)
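
A minimal sketch of the two suggestions on made-up data (the data frames below are illustrative assumptions, not the poster's data). merge() is shown as a third option that is not part of the reply above; it keeps only the IDs present in both sets.

```r
# Hypothetical example data: A is the "large" set, B only carries IDs.
A <- data.frame(ID     = c(1, 2, 3, 4, 5),
                Gender = c("F", "M", "F", "M", "F"),
                Income = c(10, 20, 30, 40, 50))
B <- data.frame(ID = c(2, 4, 9))   # ID 9 does not occur in A

# Logical indexing: keep the rows of A whose ID occurs in B.
sub1 <- A[A$ID %in% B$ID, ]

# Same selection with subset(), which evaluates ID inside A.
sub2 <- subset(A, ID %in% B$ID)

# Alternative (not from the thread): an inner join on ID.
sub3 <- merge(A, B, by = "ID")

print(sub1)
```

Note that IDs in B that are missing from A (9 in this sketch) are silently ignored by all three approaches.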




How to do this in R>?  Is there any commands to do this?

Thank you


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to make multiple curves in one plot

2010-11-02 Thread Joshua Wiley
Hi Karena,

As Remko mentioned, it really is polite to say what package your
function is from if it's not one of the basics (such as base or stats).
Also, it is nice to give us sample data (side note: both of these
things are mentioned in the posting guide, which you may find here:
http://www.r-project.org/posting-guide.html).

In any case, I was able to search and find what is most likely the
function you are using.  Although it does not make particular sense,
here is an example plot with two sets of different lines added.  They
incorporate results from running(), although the general plot() and
lines() functions do not really care where the x and y values came
from.

library(gtools)
set.seed(1212)
dat <- runif(100)
plot(dat, running(dat, width = 5, pad = TRUE))
lines(dat, running(dat, width=10, fun = median, pad = TRUE))
lines(dat, running(dat, width=10, fun = mean, pad = TRUE), col = "blue")
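
For the original question (two running-mean curves in one plot), here is a hedged sketch that runs without gtools. It uses stats::filter() as a stand-in for running(), and the vectors y_empi and y_p are simulated placeholders for -log10(results_chr_p$empi_p) and -log10(results_chr_p$p); all of these names are assumptions for illustration.

```r
set.seed(1212)
# Simulated stand-ins for the two -log10(p) series in the question.
y_empi <- -log10(runif(200))
y_p    <- -log10(runif(200))

# Centered running mean of the given (odd) width; the first and last
# (width - 1) / 2 positions are NA because the window is incomplete there.
run_mean <- function(x, width) {
  as.numeric(stats::filter(x, rep(1 / width, width), sides = 2))
}

r1 <- run_mean(y_empi, 41)
r2 <- run_mean(y_p, 41)

# Draw the first curve with plot(), then add the second with lines().
plot(r1, type = "l", lwd = 2, xlab = "", ylab = "-Log(p)",
     ylim = range(c(r1, r2), na.rm = TRUE))
lines(r2, lwd = 2, col = "blue")
legend("topright", legend = c("empi_p", "p"),
       col = c("black", "blue"), lwd = 2)
```

Setting ylim from both series up front matters, because lines() never rescales the axes chosen by the initial plot() call.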

HTH,

Josh


On Mon, Nov 1, 2010 at 8:34 PM, karena  wrote:
>
> hello,
>
> plot(running(-log10(results_chr_p$empi_p), fun=mean, width=41, font.axis=4,
> by=1),type="l",cex=0.1, ylab="-Log(p)", ylim=c(0,5.0), xlab=" ", lwd=2)
>
>
> this is my code to make a plot. The problem is, now I want to add one more
> curve to the plot, which is for another variable in the data.frame
> -log10(results_chr_p$p). My question is: how to make multiple lines in one
> plot, especially when using the 'running' function in it??
>
> thank you very much,
>
> karena



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] Help on install.packages() function

2010-11-02 Thread Christofer Bogaso
Hi, when I need to download a package I use, for example, the
install.packages("fBasics") function. However, simply calling that function
requires additional intervention to select the server from which I want to
download. I would like to ask the list how I can pass an additional argument
to the install.packages() function to stop that pop-up from showing.

Looking at the help page, I guess "repos = getOption("repos")" is the
responsible one. However, I am still in the dark about how to change that
option.

Thanks,



Re: [R] Help on install.packages() function

2010-11-02 Thread Joshua Wiley
Hi Christofer,

I selected a mirror in my .Rprofile file (which I think is fairly
common), so I do not have to set it every time I start R.  In any
case, you just have to set the CRAN entry of the "repos" option to the
URL of the mirror you want.  Something like this ought to work:

r <- getOption("repos")
r["CRAN"] <- "your/favorite/repo"
options(repos = r)
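
A concrete version of the same idea, as a sketch: the mirror URL below (https://cloud.r-project.org) is just one possible choice and is an assumption, so substitute your preferred mirror. The install.packages() calls are commented out because they need network access.

```r
# Set the CRAN entry of the "repos" option once per session
# (or put these lines in your .Rprofile).
r <- getOption("repos")
r["CRAN"] <- "https://cloud.r-project.org"   # assumed mirror; use your own
options(repos = r)

# With the option set, no mirror chooser should pop up:
# install.packages("fBasics")

# Alternatively, pass the mirror for a single call via the repos argument:
# install.packages("fBasics", repos = "https://cloud.r-project.org")

print(getOption("repos")[["CRAN"]])
```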


Cheers,

Josh




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] get the scripname within the executed myscript.r

2010-11-02 Thread Žroutík
Dear R-users!

Is there a way to obtain the name of the executed myscript.r, i.e. when

cmd> rscript myscript.r

is executed? (The name of the script in this case is "myscript.r")

Here is why I would like to get that: I have prepared a set of scripts
(decompose_data.r, plot_with_higher_resolution.r, plot_3D.r, etc.) placed
together in one root directory. I keep only one original of each up to date.
For each new dataset, I create a new directory in my directory tree, and I
copy in r scripts with the same names, each containing a single line that
executes the original, e.g.

source(file.path("w:/data & fits/root4scripts", "decompose_data.r"))

This way I can process old datasets with new/updated scripts automatically
(by executing all decompose_data.r in the directory tree), or by copying new
"links".r to the directory. A small inconvenience of this way of executing
is that, once named, it is hard to change the name of a script. Therefore
I would like to introduce a function that reads the name of the script
executed somewhere deep in my directory tree and executes the proper
script.r in the root. If anybody has read this far, please: is there a more
convenient/straightforward way to manage several scripts across a directory
tree with datasets and keep them all up to date?

Thanks for listening, Zroutik



Re: [R] How to stop showing messages while loading package?

2010-11-02 Thread Christofer Bogaso
Thanks Erik for your reply. However, I would also like to stop the messages
shown while downloading and installing a package. For example, I get the
following messages when I download fBasics:

"trying URL '
http://cran.cnr.Berkeley.edu/bin/windows/contrib/2.10/fBasics_2110.79.zip'
Content type 'application/zip' length 1309928 bytes (1.2 Mb)
opened URL
downloaded 1.2 Mb
package 'fBasics' successfully unpacked and MD5 sums checked
The downloaded packages are in
C:\Documents and Settings\tcp28.TRANS\Local
Settings\Temp\Rtmpsj6utY\downloaded_packages"

Is there any way to stop showing this message?

Thanks,


On Tue, Nov 2, 2010 at 12:16 AM, Erik Iverson  wrote:

> Simply read the ?library help page, where you'll find under Details:
>
> To suppress messages during the loading of packages use
> ‘suppressPackageStartupMessages’: this will suppress all messages
> from R itself but not necessarily all those from package authors.
>
>
> Christofer Bogaso wrote:
>
>> Hi, is there any way to stop showing all the messages that sometimes appear
>> while loading a package? For example, say I want to load the fBasics
>> package. I get the following notices:
>>
>> > library(fBasics)
>> Loading required package: MASS
>> Loading required package: timeDate
>> Loading required package: timeSeries
>> Attaching package: 'timeSeries'
>> The following object(s) are masked from 'package:zoo':
>>     time<-
>>
>> Attaching package: 'fBasics'
>> The following object(s) are masked from 'package:base':
>>     norm
>> I want to stop all of the above information from being shown. I would really
>> appreciate it if somebody could point me to a way.
>>
>> Thanks and regards,
>>
>
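
A small sketch of how suppressPackageStartupMessages() behaves. The function noisy_load() is a hypothetical stand-in for library(fBasics), so the example runs without that package. For the download chatter, install.packages() has a quiet argument that reduces the amount of output; whether it silences every line shown above is not confirmed in this thread.

```r
# Hypothetical stand-in for a package that prints startup messages.
noisy_load <- function() {
  packageStartupMessage("Loading required package: example")
  invisible(TRUE)
}

# Count the messages that escape the suppression wrapper.
n_msgs <- 0
withCallingHandlers(
  suppressPackageStartupMessages(noisy_load()),
  message = function(m) {
    n_msgs <<- n_msgs + 1
    invokeRestart("muffleMessage")
  }
)
print(n_msgs)

# Typical use with a real package (commented out: needs fBasics installed):
# suppressPackageStartupMessages(library(fBasics))

# Quieter installation (commented out: needs network access):
# install.packages("fBasics", quiet = TRUE)
```

As the quoted help text says, only messages raised as startup messages are suppressed; output that a package prints by other means may still appear.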



Re: [R] get the scripname within the executed myscript.r

2010-11-02 Thread Ivan Calandra

Hi,

Not sure it would really help you since I haven't understood everything, 
but read

?list.files
the pattern argument is quite convenient too.
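
A hedged sketch of both pieces. First, when a script is started with Rscript, the full argument vector from commandArgs(trailingOnly = FALSE) normally contains an entry of the form --file=myscript.r, from which the name can be extracted. Second, list.files() with pattern and recursive = TRUE can locate all copies of a script in a tree. The helper name get_script_name and the temporary demo directory are assumptions for illustration.

```r
# Extract the script name from an Rscript-style argument vector.
# Returns NA when no --file= entry is present (e.g. in an interactive session).
get_script_name <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) return(NA_character_)
  basename(sub("^--file=", "", file_arg[1]))
}

# Example with an explicit argument vector (hypothetical values):
get_script_name(c("R", "--slave", "--file=data1/decompose_data.r"))

# Locate every copy of a script below a root directory.
root <- file.path(tempdir(), "root4scripts_demo")   # demo directory
dir.create(file.path(root, "dataset1"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "dataset1", "decompose_data.r"))
file.create(file.path(root, "dataset1", "plot_3D.r"))

found <- list.files(root, pattern = "^decompose_data\\.r$",
                    recursive = TRUE, full.names = TRUE)
print(basename(found))
```

With the name in hand, a one-line "link" script could call source(file.path(root_dir, get_script_name())), where root_dir is the directory holding the originals.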

Ivan


--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php



Re: [R] replace text at certain positions in a file

2010-11-02 Thread RINNER Heinrich
Hello,

thanks again for this nice solution using sub and regular expressions!
However, in real life I have to overwrite more than two positions with blanks 
(say 50 or so), so I was trying to modify this in the following way:

> s <- c("ab34cd78e", "fg3 hi78j")

# your suggestion (works perfectly for the simple case):
> sub("^(..)..(..)..", "\\1  \\2  ", s)
[1] "ab  cd  e" "fg  hi  j"

# generalizing the pattern (works as well):
> sub("^(.{2}).{2}(.{2}).{2}", "\\1  \\2  ", s)
[1] "ab  cd  e" "fg  hi  j"

# generalizing the replacement (doesn't work):
>  sub("^(.{2}).{2}(.{2}).{2}", "\\1 {2}\\2 {2}", s)
[1] "ab {2}cd {2}e" "fg {2}hi {2}j"

Apparently, " {2}" is not interpreted as a string with two blanks ("  ") in 
the replacement part, so something is wrong in my expression there. I just 
can't figure out what...

Kind regards
Heinrich.


> -Original Message-
> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Sent: Thursday, October 28, 2010 13:40
> To: RINNER Heinrich
> Cc: r-h...@stat.math.ethz.ch
> Subject: Re: [R] replace text at certain positions in a file
>
>
> On Thu, Oct 28, 2010 at 5:26 AM, RINNER Heinrich
>  wrote:
> > Hello,
> >
> > I am working with R version 2.10.1 under windows.
> > In a text file, I need to replace all characters at certain
> column positions with blanks.
> > For example, say the file contains two lines and looks like this:
> >
> > ab34cd78e
> > fg3 hi78j
> >
> > I'd like to replace everything at positions 3-4 and 7-8
> with blanks, so the output should be:
> >
> > ab  cd  e
> > fg  hi  j
> >
> > [I'm not sure if this is really an R question(?), solutions
> outside of R - maybe via shell() or so - are welcome!]
> >
>
> Try this:
>
> > s <- c("ab34cd78e", "fg3 hi78j")
> > sub("^(..)..(..)..", "\\1  \\2  ", s)
> [1] "ab  cd  e" "fg  hi  j"
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>



[R] help mkin

2010-11-02 Thread rebekka baumgartner

Hi,
I am trying fit my kinetic data with the "mkin" package. There I have two 
questions:

1.) I would like to fix the parameter parent_0, the concentration of the parent 
at time 0. In the mkinfit function description it says you can do it with the 
arguments parms.ini and fixed_parms. In which form do these arguments have to 
be to fix the (internally defined and fitted) parameter parent_0?
SFO <- mkinmod(parent=list(type="SFO"))
SFO.fit <- mkinfit(SFO,data, parm.ini=??, fixed_parms=??)

2.) When I try to fit the parent and a metabolite m1, I get the error message:
Error in dnorm(x, mean, sd, log) : 
  Non-numeric argument to mathematical function
In addition: Warning messages:
1: In Ops.factor(ModVar, obsdat) : - not meaningful for factors
2: In Ops.factor(ModVar, obsdat) : - not meaningful for factors

My code is:
data <- read.table("101025_batch1.txt", header=TRUE)
SFO_SFO <- mkinmod(parent = list(type = "SFO", to = "m1", sink =TRUE),m1 = 
list(type = "SFO")) 
SFO.fit <- mkinfit(SFO_SFO,data)

101025_batch1.txt looks like this:
name    time  value
parent  0     NA
parent  15    NA
parent  29    23.796
parent  44    14.499
parent  58    7.650
parent  73    4.170
parent  88    2.438
parent  102   1.865
parent  117   1.598
m1      0     0
m1      15    6.476
m1      29    19.651
m1      44    32.271
m1      58    40.906
m1      73    48.295
m1      88    53.295
m1      102   56.250
m1      117   58.459
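
The Ops.factor warnings in the error message above suggest that a column was read in as a factor rather than as numbers; that is an assumption based on the warning text, not something confirmed in this thread. A sketch of how one might check the column classes after read.table(), using a few rows of the data inline instead of the file:

```r
# A few rows of the data, read from inline text rather than from
# 101025_batch1.txt so that the sketch is self-contained.
txt <- "name time value
parent 0 NA
parent 29 23.796
m1 0 0
m1 15 6.476"

d <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Inspect the classes: 'time' and 'value' should be numeric.
str(d)

# If a numeric column did arrive as a factor (for example because of a stray
# non-numeric entry), convert via character, never directly via as.numeric(),
# which would return the internal integer codes instead.
f   <- factor(c("23.796", "14.499", "n.d."))   # hypothetical bad column
num <- suppressWarnings(as.numeric(as.character(f)))
print(num)
```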

Thank you very much for you help!
Rebekka


Re: [R] help mkin

2010-11-02 Thread rebekka baumgartner
Sorry, I copied in the wrong error message, this is the right one:
Model cost at call  1 :  15535.3 
DLSODA-  Warning..Internal T (=R1) and H (=R2) are  
  such that in the machine, T + H = T on the next step  
 (H = step size). Solver will continue anyway.  
  In above,  R1 =  0.0D+00   R2 =  0.0D+00  
DINTDY-  T (=R1) illegal
  In above message,  R1 =  0.15000D+02  
  T not in interval TCUR - HU (= R1) to TCUR (=R2)  
  In above,  R1 =  0.0D+00   R2 =  0.0D+00  
DINTDY-  T (=R1) illegal
  In above message,  R1 =  0.29000D+02  
  T not in interval TCUR - HU (= R1) to TCUR (=R2)  
  In above,  R1 =  0.0D+00   R2 =  0.0D+00  
DLSODA-  Trouble in DINTDY.  ITASK = I1, TOUT = R1  
  In above message,  I1 = 1 
  In above message,  R1 =  0.29000D+02  
Error in lsoda(y, times, func, parms, ...) : 
  illegal input detected before taking any integration steps - see written 
message



Re: [R] Error message in fit.mult.impute (Hmisc package)

2010-11-02 Thread Frank Harrell

I tried your code with the rms package (replacement for the Design package;
see http://biostat.mc.vanderbilt.edu/Rrms) and it worked fine.

Note that multiple imputation needs the outcome variable in the imputation
model.

Frank


-
Frank Harrell
Department of Biostatistics, Vanderbilt University


Re: [R] question in using nlme and lme4 for unbalanced data

2010-11-02 Thread Mike Marchywka









> Date: Mon, 1 Nov 2010 17:38:54 -0700
> From: djmu...@gmail.com
> To: cy...@email.arizona.edu
> CC: r-help@r-project.org
> Subject: Re: [R] question in using nlme and lme4 for unbalanced data
>
> Hi:
>
> On Mon, Nov 1, 2010 at 3:59 PM, Chi Yuan  wrote:
>
> > Hello:
> > I need some help about using mixed for model for unbalanced data. I
> > have an two factorial random block design. It's a ecology
[...]
>
> Unbalanced data is not a problem in either package. However, five blocks is
> rather at the boundary of whether or not one can compute reliable variance
> components and random effects. Given that the variance estimate of blocks in
> your models was nearly zero, you're probably better off treating them as
> fixed rather than random and analyzing the data with a fixed effects model
> instead.
>
> Another question is about p values.
> > I kind of heard the P value does not matter that much in the mixed
> > model because it's not calculate properly.
>
>
> No. p-values are not calculated in lme4 (as I understand it) because,
> especially in the case of severely unbalanced data, the true sampling
> distributions of the test statistics in small to moderate samples are not
> necessarily close to the asymptotic distributions used to compute the
> corresponding p-values. It's the (sometimes gross) disparity between the
> small-sample and asymptotic distributions that makes the reported p-values
> based on the latter unreliable, not an inability to calculate the p-value
> properly. I can assure you that Prof. Bates knows how to compute a p-value.

To add my own question on terminology [even the statements here should be
taken as questions]: assuming the null hypothesis is true and you have some
underlying population distribution of various attributes, you get some
distribution for your test statistic over repeated experiments. The
asymptotic distribution, I take it, is the true population distribution, which
may not be well reflected in your (small) sample? Usually people justify
non-parametrics by saying they help in the small-sample/outlier cases.
Alternatively, if you have some reasonable basis for knowing the true
population distributions, you could use that for the p-value calculation
and/or do Monte Carlo and just measure the number of times you incorrectly
reject the null hypothesis, etc. Of course, Monte Carlo code needs to be
debugged too, so nothing will be a sure thing. Introducing new things like an
independently known population distribution may not be statistically rigorous
by some criteria (comments welcome LOL), but you are free to examine it for
analysis.
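
The Monte Carlo idea can be sketched as follows. This is a generic illustration, not the poster's design: the exponential population, the sample size of 8, and the one-sample t-test are all assumptions chosen to show a small sample from a skewed distribution where the null hypothesis is true by construction.

```r
set.seed(42)
n_sim <- 2000    # number of simulated experiments
n     <- 8       # small sample size per experiment
alpha <- 0.05    # nominal significance level

# Each experiment draws from an exponential population whose true mean is 1,
# so H0: mu = 1 holds and every rejection is a false rejection.
p_values <- replicate(n_sim, {
  x <- rexp(n, rate = 1)
  t.test(x, mu = 1)$p.value
})

# Empirical false-rejection rate, to be compared with the nominal alpha.
reject_rate <- mean(p_values < alpha)
print(reject_rate)
```

If the reported p-values were reliable at this sample size, the empirical rate would sit close to alpha; a clear gap is the disparity between small-sample and asymptotic behaviour discussed above.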


>
> Is there any other way I can
> > tell whether the treatment has a effect not? I know AIC is for model
> > comparison,

Get more data? In this case, it would seem the goal of statistical analysis
is to make some guesses about causality. Presumably this is one piece of
evidence in a larger "case" that includes theory and other observations.
To paraphrase the popular legal phrase,
"if the model doesn't fit you must not quit." Or, as someone at the US FDA
is quoted as saying, "A p-value is no substitute for a brain."



> > do I report this in formal publication?

I guess that depends on the journal (LOL). Personally I'd be more worried
about getting a convincing story together than playing to a specific
audience. However, many questions of detail do relate to the audience and
journal: you want to use the math to determine reality, but what you present
depends on the publication. There is nothing wrong with presenting novel
analyses with enough detail to the right audience, but it may not be for
everyone :)

> >
>
> As mentioned above, I would suggest analyzing this as a fixed effects
> problem. Since the imbalance is not too bad, and it is not unusual in field
> experiments to have more control EUs than treatment EUs within each level of
> treatment, a fixed effects analysis may be sufficient. It wouldn't hurt to
> consult with a local statistician to discuss the options.

  


Re: [R] replace text at certain positions in a file

2010-11-02 Thread Gabor Grothendieck

If it's always the same pattern repeated over and over then this works:

   gsub("(..)..", "\\1  ", s)
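
When the positions do not follow one repeating pattern, an alternative that avoids regular expressions is the replacement form of substr(). The sketch below is not from the thread; the helper name blank_positions and its arguments are hypothetical. In the replacement argument of sub(), a quantifier such as " {2}" is taken literally, which is why the blanks are built explicitly here.

```r
# Overwrite the character ranges [starts[i], stops[i]] with blanks in every
# element of x. starts and stops are 1-based, inclusive column positions.
blank_positions <- function(x, starts, stops) {
  for (i in seq_along(starts)) {
    width  <- stops[i] - starts[i] + 1
    blanks <- paste(rep(" ", width), collapse = "")
    substr(x, starts[i], stops[i]) <- blanks
  }
  x
}

s <- c("ab34cd78e", "fg3 hi78j")

# Blank out positions 3-4 and 7-8, as in the original question.
res <- blank_positions(s, starts = c(3, 7), stops = c(4, 8))
print(res)

# The repeated-pattern solution from the reply, for comparison.
res_gsub <- gsub("(..)..", "\\1  ", s)
print(res_gsub)
```

For a file, the same helper could be applied to the result of readLines() and written back with writeLines().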

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] spatial plots maps-ssplot

2010-11-02 Thread pelt

Hi all,

I have made a plot with spplot, using a SpatialPointsDataFrame. The 
content is quite simple: I have 9 grid points with lon/lat 
coordinates and 9 values attached to these coordinates. They lie in a 
square area of 3 by 3 grid boxes.
I would like to lay a map from the maps package over these values, but 
when I try this, the (smaller) grid of the map does not line up with the 
grid I have already created with spplot(). I would like shaded grid boxes; 
the boxes should be filled with colour. Can anyone help me with this 
problem? Thank you in advance!

This is part of my code:

library(sp)       # GridTopology, SpatialGrid, SpatialPointsDataFrame, spplot
library(maps)     # map()
library(mapdata)  # 'worldHires' database

minlon <- 4.5
maxlon <- 12
minlat <- 46.5
maxlat <- 52.5

gt <- GridTopology(cellcentre.offset = c(5.75, 47.5), cellsize = c(2.5, 2),
                   cells.dim = c(3, 3))
grd <- SpatialGrid(gt, proj4string = CRS(as.character(NA)))
gridparameters(grd)
gd <- as.data.frame(grd, "SpatialGrid")
h <- season_djf            # my 9 values (not shown here)
h <- as.data.frame(h)
djf.att <- SpatialPointsDataFrame(gd, h)
gridded(djf.att) <- TRUE

spplot(djf.att,
       col.regions = colorRampPalette(c("red", "orange", "yellow",
                                        "lightblue", "blue", "purple")),
       xlab = "Longitude(°)", ylab = "Latitude(°)", add = T,
       scales = list(draw = T), main = "Seasonal change DJF MIUB",
       font.main = 4)

map('rivers', xlim = c(minlon, maxlon), ylim = c(minlat, maxlat),
    col = "blue", add = T)
map('worldHires', xlim = c(minlon, maxlon), ylim = c(minlat, maxlat),
    add = T, col = "darkgrey", wrap = T)

Regards,
Saskia van Pelt



[R] R script on linux?

2010-11-02 Thread gokhanocakoglu

Dear all, 
I am conducting a simulation study for my thesis in R on the Windows XP
platform, but for better performance and to save time I have to run the
script on a Linux server (some kind of workstation). However, I have no idea
which format the code should be in. Is there a converter for this situation?
Is it possible to convert the R code to a Linux format, or can you point me
to a reference for overcoming this problem?

kindest regards

Gokhan


Re: [R] R script on linux?

2010-11-02 Thread Duncan Murdoch



What is the problem?

Duncan Murdoch



Re: [R] R script on linux?

2010-11-02 Thread gokhanocakoglu

I can't run the script the program doesn't work...


Re: [R] R script on linux?

2010-11-02 Thread Jonathan P Daily
What is the error message?
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
 - Jubal Early, Firefly





[R] Question about ggplot2

2010-11-02 Thread Shige Song
Dear All,

I am trying to graph a simple scatter plot where the x axis is year
and the y axis is a percentage (percentage of infant death). Instead
of plotting the raw data, I want to plot summary statistics such as
the mean and median. Here is the problem: the value range of y is between
0 and 1, but since infant death is a rare event, the mean and median
are very low (something like 5%), which shows up as a horizontal line
at the bottom of the figure. My question is: how do I change the scale
of the y-axis so that its range is not between 0 and 1 but
between 0 and 0.1? Many thanks.

By the way, I am using ggplot2, and here is my code:

---
year.plot <- ggplot(d, aes(year, rate))
year.plot + stat_summary(fun.y = "mean", geom = "line")
---
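
One possible approach, sketched on simulated data (the data frame d below is a made-up stand-in for the poster's data): zoom the y-axis with coord_cartesian(). Unlike setting scale limits, this does not discard raw observations outside the range before the summary is computed, which matters here because the raw values are 0 or 1. Recent ggplot2 releases name the summary argument fun; older releases, as in the code above, use fun.y.

```r
library(ggplot2)

# Simulated stand-in data: 0/1 outcomes with a rare event (about 5%).
set.seed(1)
d <- data.frame(year = rep(2000:2009, each = 200),
                rate = rbinom(2000, size = 1, prob = 0.05))

year.plot <- ggplot(d, aes(year, rate))

# Summarise per year, then zoom the visible y range to [0, 0.1].
# coord_cartesian() only changes the view; the means still use all the data.
p <- year.plot +
  stat_summary(fun = "mean", geom = "line") +
  coord_cartesian(ylim = c(0, 0.1))

print(p)
```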

Best,
Shige



[R] Setting the names of a data.frame

2010-11-02 Thread Santosh Srinivas
I have tData as below. I need to set its names from the headers in the
first row of sHeaders.
Sorry, I forgot how to set the names from a row in another data frame. Please
advise.

names(tData) = sHeaders[1,] does not work correctly.

Also, why doesn't drop.levels(sHeaders) work?

dput(tData)
structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi Kumar",
"Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L, 
3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B", 
"S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label =
c("2120", 
"4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L, 
2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L, 
3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"), 
V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class =
"factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8, class = "data.frame")


dput(sHeaders)
structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller", 
"Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label =
c("05/10/2010", 
"%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label =
c("Buy /Sale", 
"Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000", 
"%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
.Label = c("", 
"Holding after Transaction"), class = "factor"), V6 = structure(NA_integer_,
.Label = "1000", class = "factor"), 
V7 = structure(NA_integer_, .Label = "", class = "factor")), .Names =
c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L, class = "data.frame")


Thanks very  much.



Re: [R] sqldf hanging on macintosh - works on windows

2010-11-02 Thread GL

Marc: Installing Simon's package worked perfectly. Thanks so much! 


Re: [R] spliting first 10 words in a string

2010-11-02 Thread Gaj Vidmar
Though  in this list, in Excel it's just (literally!) five clicks 
away!
(with the column in question selected)
Data -> Text to Columns -> Delimited -> tick Space -> Finish
Pa je! (~Voila in Slovenian)
(then import back to R, keeping only the first 10 columns if so desired)

Regards,
Assist. Prof. Gaj Vidmar, PhD
University Rehabilitation Institute, Republic of Slovenia

Irrelevant P.S. Long ago, before embarking on what eventually ended mainly 
in statistics,
I did two years of geology, so (and also because of knowing what the 
poster's institute does)
I even kinda imagine what these data are.

"Matevž Pavlič"  wrote in message 
news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...
> Hi,
>
> I am sorry, will try to be more exact from now on...
>
> I have a data.frame  with a field called Opis. It contains sentences that 
> I would like to split into words or fields in the data.frame. When I say 
> columns I mean as in an Excel table. I would like to split "Opis" into ten 
> fields from the first ten words in the Opis field.
> Here is an example of my data.frame.
>
> 'data.frame':   22928 obs. of  12 variables:
> $ VrtinaID: int  1 1 1 1 2 2 2 2 2 2 ...
> $ ZapStev : int  1 2 3 4 1 2 3 4 5 6 ...
> $ GlobinaOd   : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
> $ GlobinaDo   : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
> $ Opis: Factor w/ 12754 levels "","(MIVKA) DROBEN MELJAST 
> PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884 9123 2500 
> 4756 ...
> $ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..: 154 125 
> 101 101 NA 106 125 80 106 101 ...
> $ GeolNastOd  : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
> $ GeolNastDo  : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
> $ GeolNastOpis: Factor w/ 113 levels "","B. M. S.",..: 56 53 53 53 56 
> 53 53 53 53 53 ...
> $ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
> $ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
> $ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1 1 1 26 
> 1 1 1 1 1 ...
>
> Hope that explains better...
> Thank you, m
>
> -Original Message-
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Monday, November 01, 2010 10:13 PM
> To: Matevž Pavlič
> Cc: r-help@r-project.org
> Subject: Re: [R] spliting first 10 words in a string
>
>
> On Nov 1, 2010, at 4:39 PM, Matevž Pavlič wrote:
>
>> Hi all,
>>
>>
>>
>> I have a columnn with text that has quite a few words in it. I would
>> like to split these words in separate columns, but just first ten
>> words in the string. Is that possible in R?
>>
>>
>
> Not sure what a column means to you. It's not a precisely defined R
> type or class. (And you are requested to offer a concrete example
> rather than making us guess.)
>
> > words <- "I have a columnn with text that has quite a few words in
> it. I would like to split these words in separate columns, but just
> first ten words in the string. Is that possible in R?"
>
> > strsplit(words, " ")[[1]][1:10]
>  [1] "I"       "have"    "a"       "columnn" "with"    "text"
>  [7] "that"    "has"     "quite"   "a"
>
>
> Or if in a dataframe:
>
> > words <-c("I have a columnn with text that has quite a few words in
> it.",   "I would like to split these words in separate columns", "but
> just first ten words in the string. Is that possible in R?")
> > worddf <- data.frame(words=words)
>
> > t(sapply(strsplit(worddf$words, " "), "[", 1:10) )
>      [,1]  [,2]    [,3]    [,4]      [,5]    [,6]    [,7]    [,8]      [,9]       [,10]
> [1,] "I"   "have"  "a"     "columnn" "with"  "text"  "that"  "has"     "quite"    "a"
> [2,] "I"   "would" "like"  "to"      "split" "these" "words" "in"      "separate" "columns"
> [3,] "but" "just"  "first" "ten"     "words" "in"    "the"   "string." "Is"       "that"
>
>
> -- 
> David Winsemius, MD
> West Hartford, CT
>


[R] density() function: differences with S-PLUS

2010-11-02 Thread Nicola Sturaro Sommacal (Quantide srl)
Hello!

Does anyone know what the differences are between R and S-PLUS in the
density() function?

For example, I would like to replicate this simple S-PLUS code in R, but I
don't understand which parameters I should modify to get the same results.

S-PLUS CODE:
density(1:1000, width = 4)

R-CODE:
density(1:1000, bw = 4, window = "g",  n = 50, cut = 0.75)

I obtain the same x values, but different y values. I have also tried other
examples, with different parameters.

Can you help me?

Thank you in advance.

Nicola Sturaro
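
One possible source of the mismatch, offered as a sketch rather than a confirmed answer: the R help page for density() states that bw is the standard deviation of the smoothing kernel and that this differs from S-PLUS. R also keeps a width argument for S compatibility; for a numeric width and the Gaussian kernel it sets bw to width/4. Because cut is measured in bandwidths in R, cut = 3 with bw = 1 gives the same x grid as cut = 0.75 with bw = 4. Whether the y values then agree with S-PLUS is an assumption that still needs checking against real S-PLUS output.

```r
# Sketch: pass the S-style 'width' instead of 'bw'.
# For the Gaussian kernel R converts a numeric width to bw = width / 4.
x <- 1:1000

d_orig  <- density(x, bw = 4,    window = "g", n = 50, cut = 0.75)  # call from the post
d_width <- density(x, width = 4, window = "g", n = 50, cut = 3)     # S-compatible width

d_width$bw                         # bandwidth actually used by R
all.equal(d_orig$x, d_width$x)     # same evaluation grid
range(d_orig$y)                    # heavier smoothing (sd = 4)
range(d_width$y)                   # lighter smoothing (sd = 1)
```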



Re: [R] get the scripname within the executed myscript.r

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 5:47 AM, Žroutík wrote:


> Dear R-users!
>
> Is there a way to obtain the name of the executed myscript.r, i.e. when
>
> cmd> rscript myscript.r
>
> is executed? (The name of the script in this case is "myscript.r")


commandArgs()
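
A minimal sketch of how commandArgs() can be used for this, assuming the script is started through Rscript, which hands the script path to R as a --file= argument. The helper name script_name() is made up for illustration, and the args parameter exists only so the function can be tried without Rscript.

```r
# Hypothetical helper: extract the script file name from the command line.
script_name <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)   # e.g. "--file=myscript.r"
  if (length(file_arg) == 0) return(NA_character_)   # not started via a script file
  basename(sub("^--file=", "", file_arg[1]))
}

# Inside a stub script the name could then select the script in the root directory:
# source(file.path("w:/data & fits/root4scripts", script_name()))
```

With this, every copied stub can contain the same two lines, and renaming the stub is enough to make it run a different root script.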



> Here is the explanation of why I would like to get that: I have
> prepared a set of scripts (decompose_data.r, plot_with_higher_resolution.r,
> plot_3D.r, etc.) placed in a root directory at one place. I keep updated
> only one original of each. For each new dataset, I create a new directory in
> my directory tree, and I copy r scripts with the same names containing a
> line for executing the only one, e.g.
>
> source(file.path("w:/data & fits/root4scripts", "decompose_data.r"))
>
> This way I can execute old datasets with new/updated scripts automatically
> (by executing all decompose_data.r in the directory tree), or by copying new
> "links".r to the directory. A small inconvenience of this way of executing
> is that, once named, it is hard to change the name of the script. Therefore
> I would like to introduce a function that reads the name of the script
> executed somewhere deep in my directory tree and executes the proper
> script.r in the root. If anybody read till here, please: Is there a more
> convenient/straightforward way to manage more scripts through a directory
> tree with datasets and keep them altogether updated?
>
> Thanks for listening, Zroutik


--

David Winsemius, MD
West Hartford, CT



[R] individual intercept and slope

2010-11-02 Thread Rosario Garcia Gil
Hello

I would like to extract the estimates for the intercept and slope by individual 
for growth from a lm fit.
Any advice?

Individual  Time point  Height
1           1           10
1           2           11
1           3           23
1           4           15
1           5           21
1           6           23
2           1           24
2           2           12
2           3           9
2           4           10
2           5           11
2           6           10
...

Thanks



[R] system() and system2() functions

2010-11-02 Thread Ralph Olsson
Hello,

I help to maintain a moderate library of R code. In this code we have a number 
of calls to the system function along the lines of:

exe_output = system("./executable.exe",intern=T)

We tend to prefer system() over shell() because, provided the executable has 
been compiled and the working directory set, the command works under both linux 
and windows.

We've never had a problem with this code using R 2.9 and lower, but I've 
recently started testing code in R 2.12 and have been getting "CreateProcess 
failed to run..." error messages.

I've not found much info on this in the change logs/release notes, but from 
what I have found I am under the impression that system() no longer "shell 
quotes" the command passed to it (if I "shQuote()" the command the code runs 
fine). I also see from the help files that a new function "system2()" has 
been introduced which takes a different set of arguments and appears to be 
under development (from the help page: "system2 is the beginnings of a more 
portable interface than system").

Since I assume there to be good reasons for this change to system I'm happy to 
spend the time updating our library to work under R 2.12, but before I commence 
on this task I wanted to try to get a better understanding of what changes have 
been made to system().

My questions are:

1) What is the nature of and motivation for the changes to the system() 
function?

2) What does system2() offer that system does not?

3) Can anyone recommend the "best" (in particular most future-proof) way of 
updating our system calls, preferably, and this may be a big ask, such that 
they work in both R 2.9 and R 2.12 under both linux and windows?

If any of these questions have previously been answered and I've simply failed 
in my googling, links would be appreciated.

Many thanks for your time,

Ralph
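
One way question 3 might be approached, as a sketch only: wrap the call in a small helper that uses system2() where it exists (it was added in R 2.12.0) and falls back to system() with a shQuote()-ed command otherwise. The helper name run_and_capture() is hypothetical, and the fallback branch has not been verified on R 2.9.

```r
# Hypothetical wrapper: run an external program and return its output lines.
run_and_capture <- function(cmd, args = character()) {
  if (exists("system2", mode = "function")) {
    # R >= 2.12: command and arguments are passed separately
    system2(cmd, args = args, stdout = TRUE)
  } else {
    # Older R: build a single command line, quoting the command itself
    system(paste(c(shQuote(cmd), args), collapse = " "), intern = TRUE)
  }
}

# exe_output <- run_and_capture("./executable.exe")
```

Keeping the version check in one place means the rest of the library does not need to know which interface is in use.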


Re: [R] Error message in fit.mult.impute (Hmisc package)

2010-11-02 Thread Kim Fernandes
Thank you! That has fixed the problem.

Kim

On Tue, Nov 2, 2010 at 7:42 AM, Frank Harrell wrote:

>
> I tried your code with the rms package (replacement for the Design package;
> see http://biostat.mc.vanderbilt.edu/Rrms) and it worked fine.
>
> Note that multiple imputation needs the outcome variable in the imputation
> model.
>
> Frank
>
>
> -
> Frank Harrell
> Department of Biostatistics, Vanderbilt University


Re: [R] individual intercept and slope

2010-11-02 Thread Dimitris Rizopoulos

Have a look at function(s) lmList() from packages lme4 or nlme.
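
For illustration, here is a base-R sketch that fits one lm() per individual and collects the coefficients, using the twelve rows shown in the question. The column name Time (instead of "Time point") and the object names are assumptions made for the example; the commented line shows the lmList() form from nlme.

```r
# Data typed in from the question; 'Time' stands in for "Time point".
growth <- data.frame(
  Individual = rep(1:2, each = 6),
  Time       = rep(1:6, times = 2),
  Height     = c(10, 11, 23, 15, 21, 23, 24, 12, 9, 10, 11, 10)
)

# One straight-line fit per individual; rows = individuals, columns = coefficients
fit_one <- function(d) coef(lm(Height ~ Time, data = d))
coefs <- t(sapply(split(growth, growth$Individual), fit_one))
coefs

# With nlme installed, the packaged equivalent would be:
# coef(nlme::lmList(Height ~ Time | Individual, data = growth))
```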

I hope it helps.

Best,
Dimitris



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/



Re: [R] Question about ggplot2

2010-11-02 Thread Abhijit Dasgupta
from where you are, 

year.plot+ylim(0,0.1)

Abhijit



Re: [R] Question about ggplot2

2010-11-02 Thread Joshua Wiley
Dear Shige,

You can use scale_y_continuous() to achieve this.

year.plot <- ggplot(d, aes(year, rate))
year.plot + stat_summary(fun.y = "mean", geom = "line") +
  scale_y_continuous(limits = c(0, .1))

where limits may be whatever you like for the y axis.
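
One caveat worth checking with summary statistics: limits set on a scale (ylim() or scale_y_continuous(limits = ...)) drop observations outside the range before stat_summary() computes anything, so a mean of 0/1 or 0-to-1 rates can change. coord_cartesian() only zooms the view. The sketch below uses made-up data in place of d, and the fun argument of current ggplot2 releases (older releases, as in this thread, call it fun.y).

```r
library(ggplot2)

# Made-up stand-in for 'd': one 0/1 outcome per birth, roughly 5% deaths
set.seed(1)
d <- data.frame(year = rep(2001:2005, each = 200),
                rate = rbinom(1000, size = 1, prob = 0.05))

# Zoom the y axis to [0, 0.1] without discarding any observations
year.plot <- ggplot(d, aes(year, rate)) +
  stat_summary(fun = mean, geom = "line") +
  coord_cartesian(ylim = c(0, 0.1))

year.plot
```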

Cheers,

Josh



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Using R for Production - Discussion

2010-11-02 Thread Douglas Bates
On Mon, Nov 1, 2010 at 11:04 PM, Santosh Srinivas
 wrote:
> Hello Group,
>
> This is an open-ended question.
>
> Quite fascinated by the things I can do and the control I have on my
> activities since I started using R.
> I basically have been using this for analytical related work off my desktop.
> My experience has been quite good and most issues where I need to
> investigate and solve are typical items more related to data errors, format
> corruption, etc... not necessarily "R" Related.
>
> Complementing this with Python gives enough firepower to do lots of
> production (analytical related activities) on the cloud (from my research I
> see that every innovative technology provider seems to support Python ...
> google, amazon, etc).
>
> Question on using R for Production activities:
> Q1) Does anyone have experience of using R-scripts etc ... for production
> related activities. E.g. serving off a computational/ analytical /
> simulation environment from a webportal with the analytical processing done
> in R.
> I've seen that most useful things for normal (not rocket science) business
> (80-20 rule) can be done just as well in R in comparison with tools like
> SAS, Matlab, etc.
>
> Q2) I haven't tried the processing routines for much larger data-sets
> assuming "size" is not a constraint nowadays.
> I know that I should try out ... but any forewarnings would help. Is it
> likely that something that works for my "desktop" dataset is quite as likely
> to work when scaled up to a "cloud dataset"?
> Assuming that I do the clearing out of unused objects, not running into
> infinite loops, etc?
>
> i.e. is there any problem with the "fundamental architecture of R itself"?
> (like press articles often say)
>
>
> Q3) There are big fans of the SAS, Matlab, Mathworks environments out there
>  does anyone have a comparison of how R fares.
> From my experience R is quite neat and low level ... so overheads should be
> quite low.
> Most slowness comes due to lack of knowledge (see my code ... like using the
> wrong structures, functions, loops, etc.) rather than something wrong with
> the way R itself is.
> Perhaps there is no "commercial" focus to enhance performance related issues
> but my guess is that it is just matter of time till the community evolves
> the language to score higher on that too.
> And perhaps develops documentation to assist the challenge users with
> "performance tips" (the ten commandments types)
>
> Q4) You must have heard about the latest comment from James Goodnight of SAS
> ... "We haven't noticed that a lot. Most of our companies need industrial
> strength software that has been tested, put through every possible scenario
> or failure to make sure everything works correctly."
> My "gut" is that random passionate geeks (playing part-time) do better
> testing than a military of professionals ... (but I've no empirical evidence
> here)
>
> I am not taking a side here (although I appreciate those who do!) .. but
> looking for an objective reasoning.

Regarding performance and size of data sets I would suggest viewing
the presentation that Dirk Eddelbuettel and Romain Francois gave at
Google recently.  David Smith links to it in his blog at
blog.revolutionanalytics.com

One of the advantages of Open Source systems is that people can
provide many different kinds of hooks into the code.

At present any R vector objects use 32-bit signed integers for
indexing, which limits the size of an individual vector to 2^{31}-1.
There are some methods available for using external storage to by-pass
this but they do introduce another level of complexity.
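
The figure can be checked from within R, since the limit is simply the largest value a 32-bit signed integer can hold. A small sketch (the variable names are made up):

```r
# Largest 32-bit signed integer, i.e. the maximum vector length described here
max_len <- .Machine$integer.max
max_len == 2^31 - 1

# Memory a double vector of that length would need, in GiB (8 bytes per element)
gib_needed <- max_len * 8 / 2^30
gib_needed
```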



Re: [R] spliting first 10 words in a string

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:

> Though  in this list, in Excel it's just (literally!) five clicks
> away!
> (with the column in question selected)
> Data -> Text to Columns -> Delimited -> tick Space -> Finish
> Pa je! (~Voila in Slovenian)
> (then import back to R, keeping only the first 10 columns if so desired)


You could do the same thing without needing to leave R. Just  
read.table( textConnection(..), header=FALSE, fill=TRUE)


> read.table(textConnection(words), fill=T)
   V1    V2    V3      V4    V5    V6    V7      V8       V9     V10      V11   V12 V13 V14
1   I  have     a columnn  with  text  that     has    quite       a      few words  in it.
2   I would  like      to split these words      in separate columns
3 but  just first     ten words    in   the string.       Is    that possible    in  R?
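
If exactly the first ten words are wanted as columns, and the text column is a factor (as Opis is in the str() output of the original question), a small helper along these lines may be convenient. This is a sketch: the name first_words() and the word1..wordN column names are invented for the example.

```r
# Hypothetical helper: first n space-separated words, one word per column.
first_words <- function(x, n = 10) {
  parts  <- strsplit(as.character(x), " +")            # as.character() copes with factors
  padded <- lapply(parts, function(p) p[seq_len(n)])   # short entries are padded with NA
  out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
  names(out) <- paste("word", seq_len(n), sep = "")
  out
}

worddf <- data.frame(words = c("I have a columnn with text",
                               "but just first ten words in the string. Is that possible in R?"))
first_words(worddf$words, n = 10)
```

The result has one row per input string, so it can be attached to the original data with cbind(worddf, first_words(worddf$words)).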




Regards,
Assist. Prof. Gaj Vidmar, PhD
University Rehabilitattion Institute, Republic of Slovenia

Irrelevant P.S. Long ago, before embarking on what eventually ended  
mainly

in statistics,
I did two years of geology, so (and also because of knowing what the
poster's institute does)
I even kinda imagine what these data are.

"Matev¾ Pavliè"  wrote in message
news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...

Hi,

I am sorry, will try to be more exact from now on...

I have a data.frame  with a field called Opis. IT contains  
sentenses that

I would like to split in words or fields in data.frame...when I say
columns I mean as in Excel table. I would like to split "Opis" into  
ten

fields from the first ten words in Opis field.
Here is an example of my data.frame.

'data.frame':   22928 obs. of  12 variables:
$ VrtinaID: int  1 1 1 1 2 2 2 2 2 2 ...
$ ZapStev : int  1 2 3 4 1 2 3 4 5 6 ...
$ GlobinaOd   : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
$ GlobinaDo   : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
$ Opis: Factor w/ 12754 levels "","(MIVKA) DROBEN MELJAST
PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884  
9123 2500

4756 ...
$ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..:  
154 125

101 101 NA 106 125 80 106 101 ...
$ GeolNastOd  : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
$ GeolNastDo  : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
$ GeolNastOpis: Factor w/ 113 levels "","B. M. S.",..: 56 53 53  
53 56

53 53 53 53 53 ...
$ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
$ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
$ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1  
1 1 26

1 1 1 1 1 ...

Hope that explains better...
Thank you, m

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, November 01, 2010 10:13 PM
To: Matev¾ Pavliè
Cc: r-help@r-project.org
Subject: Re: [R] spliting first 10 words in a string


On Nov 1, 2010, at 4:39 PM, Matev¾ Pavliè wrote:


Hi all,



I have a columnn with text that has quite a few words in it. I would
like to split these words in separate columns, but just first ten
words in the string. Is that possible in R?




Not sure what a column means to you. It's not a precisely defined R
type or class. (And you are requested to offered a concrete example
rather than making us guess.)


words <-"I have a columnn with text that has quite a few words in

it. I would like to split these words in separate columns, but just
first ten words in the string. Is that possible in R?"


strsplit(words, " ")[[1]][1:10]

[1] "I"   "have""a"   "columnn" "with""text"
"that""has" "quite"   "a"


Or if in a dataframe:


words <-c("I have a columnn with text that has quite a few words in

it.",   "I would like to split these words in separate columns", "but
just first ten words in the string. Is that possible in R?")

worddf <- data.frame(words=words)



t(sapply(strsplit(worddf$words, " "), "[", 1:10) )

[,1]  [,2][,3][,4]  [,5][,6][,7][,
8]  [,9]   [,10]
[1,] "I"   "have"  "a" "columnn" "with"  "text"  "that"  "has"
"quite""a"
[2,] "I"   "would" "like"  "to"  "split" "these" "words" "in"
"separate" "columns"
[3,] "but" "just"  "first" "ten" "words" "in""the"
"string."

"Is"   "that"


--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Ivan Calandra

Hi,

The problem is that all your columns of sHeaders are factors. It might 
be better to set stringsAsFactors to FALSE when you build it.


Or you can do it with a for loop like this:
for (i in 1:length(sHeaders)){
 names(tData)[i] <- as.character(sHeaders[1,i])
}

Or with lapply:
names(tData) <- unlist(lapply(sHeaders[1, ], FUN=as.character))

HTH,
Ivan




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php



[R] Using data( ) in a loop

2010-11-02 Thread McCarthy, Ian
I'm trying to generate 50+ graphs using the UScensus2000tract data.  I
need to access the data for just about all of the states, so I was
hoping to create a simple loop that will take the relevant state from my
data and load the associated census data from the UScensus2000tract
package.   Below is a sample of what I'm trying to do.  Any suggestions
are much appreciated.

 

stores=read.table(paste(path,"\\Store List.txt",sep=""),header=TRUE,sep="\t")
city=stores$City
state=stores$State
city.state=data.frame(city,state)

state.temp=city.state$state[1]
tract <- paste(state.temp,".tract",sep="")
data(tract)

Warning message:
In data(tract) : data set 'tract' not found
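
The warning arises because data(tract) looks for a dataset literally called "tract"; it does not evaluate the variable. data() has a list argument that takes dataset names as character strings. The sketch below demonstrates this with a dataset that ships with R; the helper name load_by_name() is made up, and the commented loop assumes that the values in city.state$state match the dataset names used by UScensus2000tract and that each dataset is stored under the same name as its file.

```r
# Hypothetical helper: load a packaged dataset whose name is held in a string.
load_by_name <- function(name, package = NULL) {
  env <- new.env()
  data(list = name, package = package, envir = env)  # 'list' accepts character names
  get(name, envir = env)                             # return the loaded object
}

# Demonstration with a dataset from the 'datasets' package
cars_df <- load_by_name("mtcars", package = "datasets")

# Possible use in the loop from the post (not run here):
# for (st in unique(city.state$state)) {
#   tract <- load_by_name(paste(st, ".tract", sep = ""),
#                         package = "UScensus2000tract")
#   # ... draw the graph for this state ...
# }
```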

 

 

 

Ian McCarthy, Ph.D.

F T I 

214.397.1761 direct

214.663.1683 mobile

ian.mccar...@fticonsulting.com
 

 

 



Re: [R] Setting the names of a data.frame

2010-11-02 Thread Santosh Srinivas
Thanks. 

Actually, sHeaders was a line in tData itself ...
I just did sHeaders = tData [1,]

How can I build it without factors, as in your first suggestion?
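
A possible way to do this, sketched on a toy data frame rather than the real tData: sHeaders = tData[1, ] is a one-row data frame of factors, so the header text has to be pulled out with as.character() before it is used as names. If the data come from read.table() or read.csv(), passing stringsAsFactors = FALSE when reading avoids the factors in the first place. The column contents below are invented for the example.

```r
# Toy stand-in for tData: header text sits in the first row, all columns are factors
tData <- data.frame(V1 = c("Name", "Ramu GSV", "Rahul Kumar Singh"),
                    V2 = c("Qty", "4000", "2120"),
                    stringsAsFactors = TRUE)

# Convert the header row to a plain character vector, then use it as the names
sHeaders <- unname(vapply(tData[1, ], as.character, character(1)))
names(tData) <- sHeaders

# Drop the header row and rebuild each factor so the unused header level disappears
tData <- tData[-1, , drop = FALSE]
tData[] <- lapply(tData, function(col) factor(as.character(col)))
```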

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Ivan Calandra
Sent: 02 November 2010 20:22
To: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Hi,

The problem is that all your columns of sHeaders are factors. It might 
be better to set stringsAsFactors to FALSE when you build it.

Or you can do it with a for loop like this:
for (i in 1:length(sHeaders)){
  names(tData)[i] <- as.character(sHeaders[1,i])
}

Or with lapply:
names(tData) <- unlist(lapply(sHeaders[1, ], FUN=as.character))

HTH,
Ivan

On 11/2/2010 14:58, Santosh Srinivas wrote:
> I have tData as below. I need to set the names with the headers from the
> first row in sHeaders
> Sorry .. forgot how to set the names from a row in another data frame .. pls
> advise.
>
> names(tData) = sHeaders[1,] does not work correctly
>
> Also, why doesn't drop.levels(sHeaders) work?
>
> dput(tData)
> structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi Kumar",
> "Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L,
> 3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
> ), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B",
> "S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label = c("2120",
> "4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L,
> 2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L,
> 3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"),
> V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class = "factor")),
> .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8,
> class = "data.frame")
>
>
> dput(sHeaders)
> structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller",
> "Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label = c("05/10/2010",
> "%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label = c("Buy /Sale",
> "Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000",
> "%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
> .Label = c("", "Holding after Transaction"), class = "factor"),
> V6 = structure(NA_integer_, .Label = "1000", class = "factor"),
> V7 = structure(NA_integer_, .Label = "", class = "factor")),
> .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L,
> class = "data.frame")
>
>
> Thanks very  much.
>
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] individual intercept and slope

2010-11-02 Thread Phil Spector
You didn't say what form you wanted the output in, but 
here's one way:



sapply(split(dat,dat$individual),function(s)lm(height~time,data=s)$coef)

                   1         2
(Intercept) 8.466667 19.866667
time        2.485714 -2.057143
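
For anyone wanting to reproduce this, here is a self-contained sketch using the twelve rows posted in the question. The column names individual, time and height are assumed to match the ones used in the one-liner above.

```r
# Data as posted in the question (two individuals, six time points each)
dat <- data.frame(
  individual = rep(1:2, each = 6),
  time       = rep(1:6, times = 2),
  height     = c(10, 11, 23, 15, 21, 23,
                 24, 12,  9, 10, 11, 10)
)

# Fit one regression per individual and collect the coefficients
fits <- sapply(split(dat, dat$individual),
               function(s) coef(lm(height ~ time, data = s)))

# One row per individual is often easier to read
print(t(fits))
```

If a long-format result is preferred, t(fits) can be wrapped in as.data.frame() to get one row per individual with intercept and slope columns.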

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Tue, 2 Nov 2010, Rosario Garcia Gil wrote:


Hello

I would like to extract the estimates for the intercept and slope by individual 
for growth from a lm fit.
Any advice?

Individual Time point  Height
1   1   10
1   2   11
1   3   23
1   4   15
1   5   21
1   6   23
2   1   24
2   2   12
2   3   9
2   4   10
2   5   11
2   6   10
...

Thanks


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] subset a data.frame

2010-11-02 Thread Simone Gabbriellini
Hello List,

This should be simple, but I cannot figure it out. I am trying to subset a 
data.frame like this:

> data4
    users                time
1   user5 2009-12-01 14:09:58
2   user1 2009-12-01 14:40:16
3   user8 2009-12-04 08:18:37
4   user6 2009-12-04 08:18:37
5  user83 2009-12-04 08:18:37
6  user82 2009-12-04 08:18:37
7  user31 2009-12-04 08:18:37
8  user85 2009-12-04 08:18:37
9  user33 2009-12-04 08:18:37
10  user2 2010-01-05 07:18:36

I would like to subset it and retain, let's say, only the rows with time < 
'2010-01-05 07:18:36', but I have no idea about the syntax to do that. 

Is something like this close to the correct way?

active<-data4['time'<= as.POSIXct("2010-01-05 07:18:36", origin="1970-01-01 00:00:00-00")]

thanks in advance for any help.

best regards,
Simone Gabbriellini
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using data( ) in a loop

2010-11-02 Thread Dennis Murphy
Hi:

On Tue, Nov 2, 2010 at 8:06 AM, McCarthy, Ian <
ian.mccar...@fticonsulting.com> wrote:

> I'm trying to generate 50+ graphs using the UScensus2000tract data.  I
> need to access the data for just about all of the states, so I was
> hoping to create a simple loop that will take the relevant state from my
> data and load the associated census data from the UScensus2000tract
> package.   Below is a sample of what I'm trying to do.  Any suggestions
> are much appreciated.
>
>
> stores=read.table(paste(path,"\\StoreList.txt",sep=""),header=TRUE,sep="\t")
> city=stores$City
> state=stores$State
> city.state=data.frame(city,state)
>
Or more succinctly (the columns in stores are capitalized, so the result
keeps the names City and State):
city.state <- stores[ , c('City', 'State')]

>
>
> state.temp=city.state$state[1]
> tract <- paste(state.temp,".tract",sep="")
>

I believe you need get() here, but I would think you'd need a path to the
state file you want to grab. See ?get
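
A possible alternative to get(), offered as a sketch rather than a tested fix for UScensus2000tract: data() does not evaluate its first argument, so data(tract) looks for a data set literally named "tract". The list argument of data() accepts the name as a character string. The helper load_dataset() below is hypothetical, and the demonstration uses data sets from the base datasets package because the census package may not be installed.

```r
# Load a data set whose name is held in a character variable.
# data(list = ...) takes the name as a string; the object is placed in
# a scratch environment and returned, so the workspace is not cluttered.
load_dataset <- function(name, package = NULL, envir = new.env()) {
  data(list = name, package = package, envir = envir)
  get(name, envir = envir)
}

# Demonstration with base data sets; in the original problem the names
# would be built with paste(state.temp, ".tract", sep = "")
for (nm in c("cars", "women")) {
  d <- load_dataset(nm, package = "datasets")
  cat(nm, "has", nrow(d), "rows\n")
}
```

In the loop from the question, something like load_dataset(tract, package = "UScensus2000tract") would be the analogous call, assuming the data set names in that package follow the state.tract pattern.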

HTH,
Dennis

>
> data(tract)
>
> Warning message:
>
> In data(tract) : data set 'tract' not found
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset a data.frame

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 11:53 AM, Simone Gabbriellini wrote:


Hello List,

this should be simple, but cannot figure it out. I am trying to  
subset a data.frame like this:



data4

    users                time
1   user5 2009-12-01 14:09:58
2   user1 2009-12-01 14:40:16
3   user8 2009-12-04 08:18:37
4   user6 2009-12-04 08:18:37
5  user83 2009-12-04 08:18:37
6  user82 2009-12-04 08:18:37
7  user31 2009-12-04 08:18:37
8  user85 2009-12-04 08:18:37
9  user33 2009-12-04 08:18:37
10  user2 2010-01-05 07:18:36

I would like to subset it and retain, let's say, only the data with  
time < '2010-01-05 07:18:36', but I have no idea about the syntax to  
do that.


is something like this close to the correct way:

active<-data4['time'<= as.POSIXct("2010-01-05 07:18:36",  
origin="1970-01-01 00:00:00-00")]


Close. Try:

 active <- data4[data4$time <= as.POSIXct("2010-01-05 07:18:36",  
origin="1970-01-01 00:00:00-00") , ]


Or:

active <- subset(data4, time <= as.POSIXct("2010-01-05 07:18:36",  
origin="1970-01-01 00:00:00-00") )
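
A runnable sketch of the same idea on a three-row stand-in for data4 (the rows and the UTC time zone are assumptions for illustration). When the cutoff is given as a character string, the origin argument appears to be unnecessary, so it is left out here.

```r
# Small stand-in for data4; times are stored as POSIXct
data4 <- data.frame(
  users = c("user5", "user8", "user2"),
  time  = as.POSIXct(c("2009-12-01 14:09:58",
                       "2009-12-04 08:18:37",
                       "2010-01-05 07:18:36"), tz = "UTC")
)

cutoff <- as.POSIXct("2010-01-05 07:18:36", tz = "UTC")

# Row subsetting: logical condition before the comma, all columns after it
active <- data4[data4$time < cutoff, ]
print(active)
```

Using < keeps only rows strictly before the cutoff, as described in the question; <= would also keep the row at exactly that time.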







--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Add more functions or dataset in my package

2010-11-02 Thread Carla Moreira
Hello,

I have built an R package; however, I now need to add a dataset to
the package. How can I do it?
Thank you very much in advance.
-- 
Carla Moreira

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strings from different locale

2010-11-02 Thread Phil Spector

Steven -
   Does typing

Sys.setlocale('LC_ALL','C')

before the offending command suppress the message?
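
If switching to the C locale only hides the message, another option (not from this thread, and assuming the file is Latin-1 encoded, which would need to be confirmed) is to re-encode the strings. A minimal sketch:

```r
# A string containing the Latin-1 byte 0xE9 (e-acute), which is not valid UTF-8
x <- "caf\xe9"
print(validUTF8(x))

# Assumption: the source encoding is Latin-1; change `from` if it is not
fixed <- iconv(x, from = "latin1", to = "UTF-8")
print(validUTF8(fixed))
```

When reading the file directly, the fileEncoding argument of read.csv() or read.table() can do the same conversion at import time.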

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Mon, 1 Nov 2010, steven mosher wrote:


I'm doing some test processing of a cvs file that appears to use a different
locale
from my machine.

I get the following warning:

input string 1 is invalid in this locale

My locale is US. Is this simply a matter of changing my locale to 'all;
locales?

I don't know what locale the string is in, is there a way to detect this or
translate

[[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset a data.frame

2010-11-02 Thread Simone Gabbriellini
many thanks, works perfectly!

best,
Simone

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Add more functions or dataset in my package

2010-11-02 Thread Uwe Ligges

See the manual "Writing R Extensions".
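
In short, that manual describes a data subdirectory in the package sources that can hold saved R objects. A minimal sketch, where the package directory mypkg and the object mydata are hypothetical names and a temporary directory stands in for the real source tree:

```r
# Hypothetical package source directory (a temp dir is used so this runs anywhere)
pkg_dir  <- file.path(tempdir(), "mypkg")
data_dir <- file.path(pkg_dir, "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# The data set to ship with the package
mydata <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Save it as data/mydata.rda; after installation, data(mydata) should find it
save(mydata, file = file.path(data_dir, "mydata.rda"))
print(list.files(data_dir))
```

The data set should also get a help page (for example man/mydata.Rd), otherwise R CMD check is likely to complain about an undocumented data set.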

Uwe Ligges


On 02.11.2010 17:13, Carla Moreira wrote:

Hello,

I have built an R package; however, I now need to add a dataset to
the package. How can I do it?
Thank you very much in advance.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] system() and system2() functions

2010-11-02 Thread Uwe Ligges



On 02.11.2010 15:16, Ralph Olsson wrote:

Hello,

I help to maintain a moderate library of R code. In this code we have a number
of calls to the system function along the lines of:

 exe_output = system("./executable.exe",intern=T)

We tend to prefer system() over shell() because, provided the executable has
been compiled and the working directory set, the command works under both linux
and windows.

We've never had a problem with this code using R 2.9 and lower, but I've
recently started testing code in R 2.12 and have been getting "CreateProcess
failed to run..." error messages.

I've not found much info on this in the change logs/release notes, but from what
I have found I am under the impression that system() no longer "shell quotes"
the command passed to it (if I "shQuote()" the command the code runs fine). I
also see from the help files that a new function "system2()" has been introduced
which takes a different set of arguments and appears to be under development
(from the help page: "system2 is the beginnings of a more portable interface than
system").

Since I assume there to be good reasons for this change to system I'm happy to
spend the time updating our library to work under R 2.12, but before I commence
on this task I wanted to try to get a better understanding of what changes have
been made to system().

My questions are:

1) What is the nature of and motivation for the changes to the system()
function?


There are many; one of them is that system() behaved differently under 
Linux and Windows.





2) What does system2() offer that system does not?


Portability.



3) Can anyone recommend the "best" (in particular most future-proof) way of
updating our system calls, preferably, and this may be a big ask, such that they
work in both R 2.9 and R2.12 under both linux and windows?


If it should work for R < 2.12.0, then use system() and add, at least 
for Windows, a shell command (such as "cmd") that allows the executable 
to run under the Windows command shell. Or better, use shell() right 
away, you need to special case for Windows anyway.
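
One way to keep a single call site across R versions is a small wrapper that uses system2() where it exists and falls back to system() with a shell-quoted command otherwise. This is only a sketch: run_exe() is a hypothetical helper, and the fallback branch mirrors the shQuote() workaround mentioned in the question rather than anything verified on R 2.9.

```r
# Run an executable and capture its standard output as a character vector
run_exe <- function(exe, args = character()) {
  if (exists("system2")) {
    # Available from R 2.12.0; the command is quoted internally
    system2(exe, args, stdout = TRUE)
  } else {
    # Older R: build the command line by hand and quote the executable
    system(paste(c(shQuote(exe), args), collapse = " "), intern = TRUE)
  }
}

# Demonstration: call the Rscript that ships with the running R
rscript <- file.path(R.home("bin"), "Rscript")
out <- run_exe(rscript, c("--vanilla", "-e", shQuote("cat(1+1)")))
print(out)
```

For the library in question, the call would become something like run_exe("./executable.exe"), with any Windows-specific handling kept inside the wrapper.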


Best,
Uwe Ligges



If any of these questions have previously been answered and I've simply failed
in my googling, links would be appreciated.

Many thanks for your time,

Ralph


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Ivan Calandra

Wait wait,

If sHeaders is actually the first line of tData, the question is how do 
you create/read this dataset in R? Isn't it read from a text/csv file? In 
that case, set the "header" argument to TRUE. If not, there are probably 
better ways to do it than what you did (i.e. extract the first 
line and reuse it).


In any case, that would be easier then (though still not the best way):
names(tData) <- unlist(lapply(tData[1, ], FUN=as.character))
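
After promoting the first row to names, that row is still in the data and its labels are still among the factor levels. A sketch of the follow-up steps on made-up data (the values are hypothetical, and droplevels() assumes R 2.12.0 or later):

```r
# Hypothetical table whose first row holds the real column headers
tData <- data.frame(V1 = c("Name", "Ramu GSV", "Rahul Kumar Singh"),
                    V2 = c("Qty", "4000", "2120"),
                    stringsAsFactors = TRUE)

# Use the first row as column names
names(tData) <- unname(vapply(tData[1, ], as.character, character(1)))

# Drop the header row and the factor levels that only it used
tData <- droplevels(tData[-1, , drop = FALSE])

# Columns that are really numeric need converting via character
tData$Qty <- as.numeric(as.character(tData$Qty))
str(tData)
```

Converting a factor with as.numeric() alone would return the internal codes, which is why as.character() comes first.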

Ivan


--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Santosh Srinivas
It is just read from a file that has introductory text at the beginning, and
the header starts slightly below that, so I couldn't use header as such.
I just modified that dataset to ignore the earlier lines ... sHeaders =
tData[4,] &  tData = tData [5:end]

The original data was actually a readHTMLTable() result from a webpage.

Your solution works well enough for my purpose ... thanks.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using R for Production - Discussion

2010-11-02 Thread Saeed Abu Nimeh
I worked on a project where we used a random forest classifier to
predict a binary response. We trained a model in the ec2 cloud with 3
million observations and 44 features. We stored the model that was
generated by R using save(mymodel,file="model.Rdata"). Now we use
model.Rdata locally to predict new observations.
In our local system, we built a parser in Perl to generate the csv
representation of the observation we want to predict, then we used
RSPerl to communicate between Perl and R. But there is a catch,
instead of loading the random forest model (model.Rdata) every time we
want to predict a new observation, we have an R console running as a
daemon with the model.Rdata loaded already. Then, we send the
observation to be predicted from Perl to R. If anyone else has better
solutions/ideas, please feel free to share.
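
The save-once, load-once pattern described above might look roughly like this in R. It is a sketch only: a linear model on the built-in cars data stands in for the random forest, the file lives in a temporary directory, and predict_one() is a hypothetical name for whatever the Perl side ends up calling.

```r
# --- Training side: fit a model and store it on disk ---
model_file <- file.path(tempdir(), "model.Rdata")
mymodel <- lm(dist ~ speed, data = cars)   # stand-in for the random forest
save(mymodel, file = model_file)
rm(mymodel)

# --- Prediction side: load the model once, when the daemon starts ---
model_env <- new.env()
load(model_file, envir = model_env)

# Each incoming observation reuses the already loaded model
predict_one <- function(obs) {
  predict(model_env$mymodel, newdata = obs)
}

p <- predict_one(data.frame(speed = 10))
print(p)
```

Keeping the model in its own environment makes it explicit which object the long-running session depends on, and avoids reloading the file for each request.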
Thanks,
Saeed

On Mon, Nov 1, 2010 at 9:04 PM, Santosh Srinivas
 wrote:
> Hello Group,
>
> This is an open-ended question.
>
> Quite fascinated by the things I can do and the control I have on my
> activities since I started using R.
> I basically have been using this for analytical related work off my desktop.
> My experience has been quite good and most issues where I need to
> investigate and solve are typical items more related to data errors, format
> corruption, etc... not necessarily "R" Related.
>
> Complementing this with Python gives enough firepower to do lots of
> production (analytical related activities) on the cloud (from my research I
> see that every innovative technology provider seems to support Python ...
> google, amazon, etc).
>
> Question on using R for Production activities:
> Q1) Does anyone have experience of using R-scripts etc ... for production
> related activities. E.g. serving off a computational/ analytical /
> simulation environment from a webportal with the analytical processing done
> in R.
> I've seen that most useful things for normal (not rocket science) business
> (80-20 rule) can be done just as well in R in comparison with tools like
> SAS, Matlab, etc.
>
> Q2) I haven't tried the processing routines for much larger data-sets
> assuming "size" is not a constraint nowadays.
> I know that I should try out ... but any forewarnings would help. Is it
> likely that something that works for my "desktop" dataset is quite as likely
> to work when scaled up to a "cloud dataset"?
> Assuming that I do the clearing out of unused objects, not running into
> infinite loops, etc?
>
> i.e. is there any problem with the "fundamental architecture of R itself"?
> (like press articles often say)
>
>
> Q3) There are big fans of the SAS, Matlab, Mathworks environments out there
>  does anyone have a comparison of how R fares.
> From my experience R is quite neat and low level ... so overheads should be
> quite low.
> Most slowness comes due to lack of knowledge (see my code ... like using the
> wrong structures, functions, loops, etc.) rather than something wrong with
> the way R itself is.
> Perhaps there is no "commercial" focus to enhance performance related issues
> but my guess is that it is just matter of time till the community evolves
> the language to score higher on that too.
> And perhaps develops documentation to assist the challenge users with
> "performance tips" (the ten commandments types)
>
> Q4) You must have heard about the latest comment from James Goodnight of SAS
> ... "We haven't noticed that a lot. Most of our companies need industrial
> strength software that has been tested, put through every possible scenario
> or failure to make sure everything works correctly."
> My "gut" is that random passionate geeks (playing part-time) do better
> testing than a military of professionals ... (but I've no empirical evidence
> here)
>
> I am not taking a side here (although I appreciate those who do!) .. but
> looking for an objective reasoning.
>
> Thanks,
> S
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] density() function: differences with S-PLUS

2010-11-02 Thread Joshua Wiley
Dear Nicola,

There are undoubtedly people here who are familiar with both S+ and R,
but they may not always be around or get to every question.  In that
case there are (at least) two good options for you:

1) Say what you want mathematically (something of a universal
language) or statistically

2) Rather than just give us S+ code, show sample data (e.g., 1:1000),
and the values you would like obtained (in this case whatever the
output from S+ was).  This would let us *try* to figure out what
happened and duplicate it in R.

From the arcane step of reading R's documentation for density (?density):

width: this exists for compatibility with S; if given, and ‘bw’ is
  not, will set ‘bw’ to ‘width’ if this is a character string,
  or to a kernel-dependent multiple of ‘width’ if this is
  numeric.

Which makes me wonder if this works for you (in R)?

density(1:1000, width = 4)
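
To see what R does with a numeric width, the bandwidth it ends up using can be inspected directly. In current R, with the default Gaussian kernel, a numeric width is divided by 4, so width = 4 should correspond to bw = 1. Whether that matches the S-PLUS output is an assumption that would need checking against the S-PLUS numbers.

```r
x <- 1:1000

# Specify the smoothing via `width`, as in the S-PLUS call
d_width <- density(x, width = 4)
print(d_width$bw)   # the bandwidth R actually used

# The same estimate specified via `bw`
d_bw <- density(x, bw = 1)
print(all.equal(d_width$y, d_bw$y))
```

This would also explain different y values in the question: bw = 4 there is four times the bandwidth implied by width = 4.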


Cheers,

Josh


On Tue, Nov 2, 2010 at 3:04 AM, Nicola Sturaro Sommacal (Quantide srl)
 wrote:
> Hello!
>
> Does anyone know what the differences are between R and S-PLUS in the
> density() function?
>
> For example, I would like to replicate this simple S-PLUS code in R, but I don't
> understand which parameter I should modify to get the same results.
>
> S-PLUS CODE:
> density(1:1000, width = 4)
>
> R-CODE:
> density(1:1000, bw = 4, window = "g",  n = 50, cut = 0.75)
>
> I obtain the same x values, but different y values. I have also tried different
> examples, with different parameters.
> Can you help me?
>
> Thank you in advance.
>
> Nicola Sturaro
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Ivan Calandra

There is also a "header" argument in readHTMLTable().
About the file itself, can't you just erase the introductory text? There 
is also a skip argument to read.table() that might help you.
It's fine if my solution works, but I think it's still safer/easier to 
import the file directly with the correct headers.
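
As a sketch of the skip approach, here is a made-up file with two lines of introductory text before the header row. The contents are hypothetical, and the text argument is used so the example runs without a file on disk.

```r
# Hypothetical file contents: two preamble lines, then header, then data
raw <- c("Disclosures report",
         "Generated from the website",
         "Name\tQty",
         "Ramu GSV\t4000",
         "Rahul Kumar Singh\t2120")

# Skip the preamble and let read.table() take the names from the header row
tData <- read.table(text = paste(raw, collapse = "\n"),
                    skip = 2, header = TRUE, sep = "\t",
                    stringsAsFactors = FALSE)
str(tData)
```

With a real file, the text argument would be replaced by the file name; the number of lines to skip has to be known or counted beforehand.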

Ivan



Le 11/2/2010 17:51, Santosh Srinivas a écrit :

It is just read from a file that has introductory text in the beginning and
a the header starts slight below  so couldn’t use header as such.
I just modified that dataset to ignore the earlier lines ... sHeaders =
tData[4,]&   tData = tData [5:end]

The original data was actually a readHTMLtable from a webpage.

Your solutions works well enough for my purpose ... thanks.


-Original Message-
From: Ivan Calandra [mailto:ivan.calan...@uni-hamburg.de]
Sent: 02 November 2010 22:15
To: Santosh Srinivas
Cc: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Wait wait,

If sHeaders is actually the first line of tData, the question is how do
you create/read this dataset in R? Isn't read from a text/csv file? In
that case, set the "header" argument to TRUE. If not, there are probably
better ways to do it, better than what you did (i.e. extract the first
line and reuse it).

In any case, that would be easier then (though still not the best way):
names(tData)<- unlist(lapply(tData[1, ], FUN=as.character))

Ivan

Le 11/2/2010 16:09, Santosh Srinivas a écrit :

Thanks.

Actually the sHeaders was a line in tData itself ...
I just did sHeaders = tData [1,]

How can I can build it without factors like your first suggestions?

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]

On

Behalf Of Ivan Calandra
Sent: 02 November 2010 20:22
To: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Hi,

The problem is that all your columns of sHeaders are factors. It might
be better to set stringsAsFactors to FALSE when you build it.

Or you can do it with a for loop like this:
for (i in 1:length(sHeaders)){
names(tData)[i]<- as.character(sHeaders[1,i])
}

Or with lapply:
names(tData)<- unlist(lapply(sHeaders[1, ], FUN=as.character))

HTH,
Ivan

On 11/2/2010 14:58, Santosh Srinivas wrote:

I have tData as below. I need to set the names with the headers from the
first row in sHeaders.
Sorry .. forgot how to set the names from a row in another data frame ..
pls advise.

names(tData) = sHeaders[1,] does not work correctly

Also, why doesn't drop.levels(sHeaders) work?

dput(tData)
structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi
Kumar",
"Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L,
3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B",
"S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label =
c("2120",
"4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L,
2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L,
3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"),
   V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class =
"factor")), .Names = c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8, class =

"data.frame")

dput(sHeaders)
structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller",
"Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label =
c("05/10/2010",
"%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label =
c("Buy /Sale",
"Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000",
"%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
.Label = c("",
"Holding after Transaction"), class = "factor"), V6 =

structure(NA_integer_,

.Label = "1000", class = "factor"),
   V7 = structure(NA_integer_, .Label = "", class = "factor")), .Names

=

c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L, class =

"data.frame")


Thanks very  much.




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php



[R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling
Inspired by colouring the dots of box-whisker plots I am trying to also 
fill the boxes (rectangles) with different colours. This seems not to 
work as I expected.


Looking at the help page of panel.bwplot it says: 'fill - color to fill 
the boxplot'. Obviously it is only intended to fill all boxes with only 
one colour?


Nevertheless the following example shows, that 'fill' from panel.bwplot 
is able to work with more than one colour. But this only works with one 
colour or multiples of 5 colours:



-
library(lattice)

bp1 <- bwplot(voice.part ~ height, data = singer, main = "1 color works",
  panel = function(...) {
    panel.bwplot(col = c("yellow"),
                 fill = c("yellow"), ...)
  })

bp2 <- bwplot(voice.part ~ height, data = singer,
  main = "3 colors do NOT work",
  panel = function(...) {
    panel.grid(v = -1, h = 0)
    panel.bwplot(col = c("yellow","blue","green"),
                 fill = c("yellow","blue","green"), ...)
  })

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do work",
  panel = function(...) {
    panel.grid(v = -1, h = 0)
    panel.bwplot(col = c("yellow","blue","green","pink","red"),
                 fill = c("yellow","blue","green","pink","red"), ...)
  })

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-


Is there any chance to use more than one filling colour correctly?

Thanks in advance,
Rainer Hurling



Re: [R] Please help me about Monte Carlo Permutation

2010-11-02 Thread Łukasz Ręcławowicz
2010/11/2 Chitra 

>
>  yes
>
> >
> >
> > Me too. So you want to do a MC test for Pearson's product-moment
> > correlation, right...?
>
>

So for sample sizes from 3 to about 10 we can use all permutations
[permn(combinat)]- test will be exact! (In our case 7!=5040)

lg <- "lightgreen"
g <- "green"
dg <- "darkgreen"
plot(gamma(1:31), type = "p", main = "Suggested tests for r",
     ylab = "Number of permutations", xlab = "n", lwd = 2,
     col = c(rep(lg, 10), rep(g, 4), rep(dg, 17)), log = "yx", pch = "-", cex = 2)
legend(1, range(gamma(4:31))[2], c("exact", "MC", "cor.test"),
       col = c(lg, g, dg), pch = "-", pt.cex = 2)
abline(h = .Machine$integer.max, col = 2, lty = 3)

We use MC when the number of permutations is very large and we cannot use
them all. Besides, for larger samples (n > 25) the difference from the
theoretical distribution will be negligible.
Let's use your data:
> data
Qtot Itot
1 73 684
2 64 451
3 71 378
4 65 284
5 47 179
6 31 117
7 19 69

We get 0.01494540 from cor.test

> cor.test(data[,1],data[,2])

Let's write a function for our test, it might be something like:

cor.test.mc <- function(x, y, n = 1e3) {
  our.data <- cbind(x, y)
  if (!is.numeric(our.data[, 1]) || !is.numeric(our.data[, 2]))
    stop("Only numeric variables are allowed.")
  l <- length(our.data[, 1])
  if (l < 3)
    stop("At least 3 samples are required.")
  DNAME <- paste(deparse(substitute(x)), "and", deparse(substitute(y)))
  # draw n random permutations of x and keep only the unique ones
  samples <- unique(t(replicate(n, sample(our.data[, 1]))))
  loop <- dim(samples)[1]
  correlations <- rep(NA, loop)
  for (i in 1:loop) {
    correlations[i] <- cor(our.data[, 2], samples[i, ])
  }
  observed <- cor(our.data[, 1], our.data[, 2])
  GE <- sum(correlations >= observed)
  LT <- sum(correlations < (-observed))
  two.tailed.p <- (GE + LT) / loop
  # percentage of all l! possible permutations that were actually used
  rea <- (loop / gamma(l + 1)) * 100
  RVAL <- list(statistic = c(r = observed), p.value = two.tailed.p,
               method = "Monte Carlo Pearson's r test",
               data.name = DNAME,
               samples = c("Number of used unique permutations" = loop),
               total = c("Percent of all possible permutations" = round(rea, 2)))
  class(RVAL) <- "htest"
  # But what kind of plot you wish to have - I don't know...
  # hist(correlations, col = "blue", xlab = "r", xlim = c(-1, 1), breaks = 50)
  return(RVAL)
}
cor.test.mc(data[, 1], data[, 2])
test <- cor.test.mc(data[, 1], data[, 2], 6e4)
test
test$samples
test$total
#

And that's it. Our p-value is sum of 7/5040 (GE) and 61/5040 (LT).
You may also take a look @ library(MChtest).
Hope this helps!
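
For comparison, the exact test mentioned at the start (all 7! = 5040 permutations) can be sketched in base R. This is only an illustration: all_perms is a made-up helper rather than permn from combinat, and the two-sided criterion used here (absolute correlation at least as large as the observed one) is an assumption that may treat ties slightly differently from the GE/LT counts above.

```r
# Data from the message above
Qtot <- c(73, 64, 71, 65, 47, 31, 19)
Itot <- c(684, 451, 378, 284, 179, 117, 69)

# all_perms() is a hypothetical helper: every permutation of v, one per row
all_perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], all_perms(v[-i]))
  }))
}

perms <- all_perms(Qtot)                  # one row per permutation of Qtot
observed <- cor(Qtot, Itot)
r_perm <- apply(perms, 1, cor, y = Itot)  # correlation for each permutation

# share of permutations at least as extreme as the observed r
# (the small tolerance guards against floating-point ties)
p_exact <- mean(abs(r_perm) >= abs(observed) - 1e-12)
p_exact
```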


-- 
Have a nice day



[R] cov.mve error

2010-11-02 Thread Marino Taussig De Bodonia, Agnese
Hello,

I am trying to use the cov.mve function on a set of variables to check for 
outliers, before I perform PCA on them. I am using the code that I found in 
"Everitt (2005) An R and S-Plus Companion to Multivariate Analysis" but it 
doesn't seem to work. I wrote:

at.central<-central[,7:17]   # 7:17 are the 10 variables that I want to 
screen for outliers
at.central.mve<-cov.mve(central, cor=T)

I also tried what the help file says to do:

at.central.mve<-cov.rob(central, cor=T, method="mve")

Both give me the error:

Error in quantile.default(as.numeric(x), c(0.25, 0.75), na.rm = na.rm,  :  
missing values and NaN's not allowed if 'na.rm' is FALSE
In addition: Warning message: In quantile(as.numeric(x), c(0.25, 0.75), na.rm = 
na.rm, names = FALSE) : NAs introduced by coercion

My dataset has no NAs, so what does this mean?? Something to do with the 
"quantile.used=" argument?

Thanks in advance for your time,

Agnese


[R] Relsurv package

2010-11-02 Thread Laurence Lauvier

Hello, 
I have a question about the relsurv package, particularly the rsadd function: 
rsadd(Surv(time,cens)~sex+ratetable(age=age*365.24,sex=sex,year=year),data=,ratetable=,int=5,method="max.lik").
 
In the tutorial, it is indicated that "the age and year must be given in the
date format, i.e. in number of days since 01.01.1960". Nevertheless, in
Pohar's article,
http://ibmi.mf.uni-lj.si/ibmi/biostat-center/predtiski/CMPB_Pohar_Stare_relsurv.pdf,
there is no indication of that. What is the correct way to use this
function?
Thanks for your help, 

Laurence 



[R] One question on heatmap

2010-11-02 Thread Hua
Dear R-helper:

Suppose we have a matrix:

Gene               sample1   sample2

Gcnt1                 12.       52.8
Max                    8.8000    39.1
Tmem176b              67.9000   304.7
Shmt2                  8.6000    42.4
Rtn4                  11.5000    57.7
Il17re                 7.6000    38.8
Bclp2                  6.2000    32.1
Mobkl3                 4.4000    32.2
Akr1b10                3.4000    30.1
Atp6ap2                5.4000    48.2
Snx2                   5.7000    63.1
Tmem176a               7.6000    91.4
Klhl9                  1.7000    30.3
Fbxo27                 1.        28.9
Scd1                  34.6000     0.7
Tspan9                35.8000     4.2
2210016L21Rik         39.1000     4.9
Ctnnb1               212.1000    33.1
Apoe                 397.2000    74.2
H2-DMb1               72.3000    14.1
Ryk                   31.7000     6.4
Dapk2                 85.4000    17.3
Gzmm                 179.4000    36.8
Actb               12993.4000  2678.1
Faim3                758.       157.6
Aktip                209.4000    46.0
Tbrg1                 93.3000    21.3

When I try to make a heatmap based on this gene expression table, I find 
that, when I set 'scale' to 'column', the heatmap is always red. I 
think this is because there are a few very large values in the matrix (gene Actb), 
while most are very small, so the colours look very ugly. I just 
wonder how to set the colours to make the heatmap look better. I have tried a 
log-transformation on the matrix and it's better now. But I do want to know if 
there are better ways to set the colour span manually and make the heatmap look 
better without any log-transformation.

Thanks in advance!

Best, Hua 


Re: [R] One question on heatmap

2010-11-02 Thread Peter Langfelder
Before plotting a heatmap we usually standardize all genes to mean
zero and variance 1. That way the green/red represent under/over
expression with respect to the mean expression, which is roughly what
the original 2-color arrays (that literally produced such heatmaps)
were measuring. Of course, standardization assumes you have more than
2 samples.

If your expression matrix is stored in the variable expr, with columns
corresponding to samples and rows to genes, you can obtain a
standardized expression matrix as

stdExpr = t(scale(t(expr)))
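
A small sketch of what that line does, using a made-up 3 x 3 matrix (the gene and sample names are hypothetical). scale() standardizes columns, so the matrix is transposed, scaled, and transposed back in order to standardize each gene (row):

```r
# Hypothetical expression matrix: rows are genes, columns are samples
expr <- matrix(c(12,  52,  30,
                  8,  39,  20,
                 68, 305, 150),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("s1", "s2", "s3")))

stdExpr <- t(scale(t(expr)))

rowMeans(stdExpr)       # per-gene means, zero up to rounding error
apply(stdExpr, 1, sd)   # per-gene standard deviations, one up to rounding error
```

With only two samples every standardized gene would have the same two values up to sign, which is why more than two samples are needed.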


Peter


On Tue, Nov 2, 2010 at 8:50 AM, Hua  wrote:
> Dear R-helper:
>
> Suppose we have a matrix:
>
> Gene            sample1 sample2
>
> Gcnt1            12.    52.8
> Max               8.8000    39.1
> Tmem176b         67.9000   304.7
> Shmt2             8.6000    42.4
> Rtn4             11.5000    57.7
> Il17re            7.6000    38.8
> Bclp2             6.2000    32.1
> Mobkl3            4.4000    32.2
> Akr1b10           3.4000    30.1
> Atp6ap2           5.4000    48.2
> Snx2              5.7000    63.1
> Tmem176a          7.6000    91.4
> Klhl9             1.7000    30.3
> Fbxo27            1.    28.9
> Scd1             34.6000     0.7
> Tspan9           35.8000     4.2
> 2210016L21Rik    39.1000     4.9
> Ctnnb1          212.1000    33.1
> Apoe            397.2000    74.2
> H2-DMb1          72.3000    14.1
> Ryk              31.7000     6.4
> Dapk2            85.4000    17.3
> Gzmm            179.4000    36.8
> Actb          12993.4000  2678.1
> Faim3           758.   157.6
> Aktip           209.4000    46.0
> Tbrg1            93.3000    21.3
>
> When I try to make heatmap based on this gene expression value table, I found 
> that, when I set 'scale' to 'column', the heatmap will be always be red. I 
> think this is because, there's very large values in the matrix (gene Actb), 
> while the most are just very small. Thus, the color will be very ugly. I just 
> wonder, how to set the color to make the heatmap look better?  I have tried 
> log-tranformation on the matrix and it's better now. But I do want to know if 
> you have better ways to set the color span manually and make the heatmap look 
> better without any log-transformation?
>
> Thanks in advance!
>
> Best, Hua
>



Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:

Inspired by colouring the dots of box-whisker plots I am trying to  
also fill the boxes (rectangles) with different colours. This seems  
not to work as I expected.


Looking at the help page of panel.bwplot it says: 'fill - color to  
fill the boxplot'. Obviously it is only intended to fill all boxes  
with only one colour?


Nevertheless the following example shows, that 'fill' from  
panel.bwplot is able to work with more than one colour. But this  
only works with one colour or multiples of 5 colours:



-
bp1 <- bwplot(voice.part ~ height, data = singer, main="1 color  
works",

 panel = function(...) {
   panel.bwplot(col=c("yellow"),
fill=c("yellow"), ...)
 })

bp2 <- bwplot(voice.part ~ height, data = singer, main = "3 colors  
do NOT work",

 panel = function(...) {
   panel.grid(v = -1, h = 0)
   panel.bwplot(col=c("yellow","blue","green"),
fill=c("yellow","blue","green"), ...)
 })

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors  
do work",

 panel = function(...) {
   panel.grid(v = -1, h = 0)

panel.bwplot(col=c("yellow","blue","green","pink","red"),

fill=c("yellow","blue","green","pink","red"), ...)
  })

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-


Is there any chance to use more than one filling colour correctly?



You have eight boxes to fill and 8 dots to color. You can either  
supply 8 distinct colors or you can supply some lesser number and they  
will be recycled across the entire 8 boxes and dots. What you cannot  
do (and expect to see the dots against the fill background) is plot  
the dots in the same colors as the fill.


This will let you see all colors of dots and fill with only 4 colors  
because I set it up so there were no two identical colors in the  
sequence of dots and fill during the recycling:


bp4 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do work",
  panel = function(...) {
    panel.grid(v = -1, h = 0)
    panel.bwplot(col = rev(c("yellow","blue","green","pink")),
                 fill = c("yellow","blue","green","pink"), ...)
  })
bp4
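
The recycling argument can be checked outside lattice with plain vectors. This is only a sketch of ordinary in-order recycling across 8 boxes; it assumes that is how the colours are reused and does not reproduce panel.bwplot's internals:

```r
n_boxes   <- 8
fill_cols <- c("yellow", "blue", "green", "pink")
dot_cols  <- rev(fill_cols)

# recycle both colour vectors to the number of boxes
fill8 <- rep(fill_cols, length.out = n_boxes)
dot8  <- rep(dot_cols,  length.out = n_boxes)

data.frame(box = 1:n_boxes, fill = fill8, dot = dot8)

# FALSE: no box gets a dot in the same colour as its fill
any(fill8 == dot8)
```

With an odd number of colours the middle colour of the reversed vector would coincide with the fill, which is why four colours were used here.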




Thanks in advance,
Rainer Hurling



David Winsemius, MD
West Hartford, CT



[R] timeSequence

2010-11-02 Thread Carla Leal Kaymalyz
Hello,
I have time series data whose days are not consecutive. I used timeSequence
(from = "4/1/2010", to = "31/12/2010", format = "%Y-%m-%d", FinCenter =
"GMT") to generate a vector of consecutive days. However, I need to exclude
Sundays and also join this vector to a data frame containing incomplete
dates, for example:

Days    X1   X2
day 1   10   20
day 2   30   50
day 4   40   45
day 5   45   35
day 6   20   10

I then want to add day 3 to the above, so that it looks like this:

Days    X1   X2
day 1   10   20
day 2   30   50
day 3   NA   NA
day 4   40   45
day 5   45   35
day 6   20   10

Thanks
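
One possible approach, sketched in base R rather than with timeSequence: build the full run of dates, drop Sundays, and merge it with the observed data so that missing days become NA. The dates and the names obs, all_days and full are made up for illustration.

```r
# Hypothetical observations: 6 and 11 January 2010 are missing
obs <- data.frame(
  Days = as.Date(c("2010-01-04", "2010-01-05", "2010-01-07",
                   "2010-01-08", "2010-01-09")),
  X1 = c(10, 30, 40, 45, 20),
  X2 = c(20, 50, 45, 35, 10)
)

# every calendar day in the period
all_days <- seq(as.Date("2010-01-04"), as.Date("2010-01-11"), by = "day")

# drop Sundays (wday is 0 for Sunday)
all_days <- all_days[as.POSIXlt(all_days)$wday != 0]

# all.x = TRUE keeps every day and fills unmatched rows with NA
full <- merge(data.frame(Days = all_days), obs, by = "Days", all.x = TRUE)
full
```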



Re: [R] Relsurv package

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 12:29 PM, Laurence Lauvier wrote:



Hello,
I have a question about relsurv package particularly rsadd function:
rsadd(Surv(time,cens)~sex+ratetable(age=age*365.24,sex=sex,year=year),data=,ratetable=,int=5,method="max.lik").
In the tutorial, it is indicated that "the age and year must be given in the
date format, i.e. in number of days since 01.01.1960". Nevertheless, in
Pohar's article,
http://ibmi.mf.uni-lj.si/ibmi/biostat-center/predtiski/CMPB_Pohar_Stare_relsurv.pdf,
there is no indication of that. What is the correct way to use this
function?
Thanks for your help,


I seem to remember an almost identical question on rhelp from some  
months ago. (I remember because I looked at the article and the  
package documentation at the time.) Have you contacted the authors at  
any point?


--
David Winsemius, MD
West Hartford, CT



Re: [R] R script on linux?

2010-11-02 Thread Thomas Levine
Open a terminal, then run these two commands.

cd /home/the/directory/with/your/script
R

Then run this in R

source('yourscript.R')

Tom

2010/11/2 Jonathan P Daily :
> What is the error message?
> --
> Jonathan P. Daily
> Technician - USGS Leetown Science Center
> 11649 Leetown Road
> Kearneysville WV, 25430
> (304) 724-4480
> "Is the room still a room when its empty? Does the room,
>  the thing itself have purpose? Or do we, what's the word... imbue it."
>     - Jubal Early, Firefly
>
>
>
> From:
> gokhanocakoglu 
> To:
> r-help@r-project.org
> Date:
> 11/02/2010 09:11 AM
> Subject:
> Re: [R] R  script on linux?
> Sent by:
> r-help-boun...@r-project.org
>
>
>
>
> I can't run the script the program doesn't work...
>



Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling

On 02.11.2010 19:08 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:


Inspired by colouring the dots of box-whisker plots I am trying to
also fill the boxes (rectangles) with different colours. This seems
not to work as I expected.

Looking at the help page of panel.bwplot it says: 'fill - color to
fill the boxplot'. Obviously it is only intended to fill all boxes
with only one colour?

Nevertheless the following example shows, that 'fill' from
panel.bwplot is able to work with more than one colour. But this only
works with one colour or multiples of 5 colours:


-
bp1 <- bwplot(voice.part ~ height, data = singer, main="1 color works",
panel = function(...) {
panel.bwplot(col=c("yellow"),
fill=c("yellow"), ...)
})

bp2 <- bwplot(voice.part ~ height, data = singer, main = "3 colors do
NOT work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green"),
fill=c("yellow","blue","green"), ...)
})

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do
work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green","pink","red"),
fill=c("yellow","blue","green","pink","red"), ...)
})

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-

Is there any chance to use more than one filling colour correctly?




Thanks for answering.


You have eight boxes to fill and 8 dots to color. You can either supply
8 distinct colors or you can supply some lesser number and they will be
recycled across the entire 8 boxes and dots. What you cannot do ( and
expect to see the dots against the fill background) is plot the dots as
the same colors as the fill.


It was not my intention to get the dots coloured in the same colour as 
the boxes. Instead I am looking for a method to fill the boxes with a 
predefined set of different colours (from a colour vector). As far as I 
can see this is only possible for one colour and multiples of five colours.


The dots should remain uncoloured ...


This will let you see all colors of dots and fill with only 4 colors
because I set it up so there were no two identical colors in the sequence
of dots and fill during the recycling:

bp4 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do
work",
panel = function(...) {
panel.bwplot(col=rev(c("yellow","blue","green","pink")),
fill=c("yellow","blue","green","pink"), ...)
})


In your example you can see that the dots colors are painted in the 
right (reversed) order, the boxes are painted as sequence 
c("yellow","pink","green","blue") instead of 
c("yellow","blue","green","pink").


I do not understand how to pass a given order and a given number of 
colours to the boxes.




Thanks in advance,
Rainer Hurling


David Winsemius, MD
West Hartford, CT




Re: [R] R script on linux?

2010-11-02 Thread Jonathan P Daily
Alternatively, you can simply prefix all scripts with

#! /path/to/R/Rscript
...

where the path is usually /usr/bin/
This info is in the manual that comes packaged with R under, conveniently, 
the scripting section. I assumed that he was getting some error message.
Likely, the script was not created executable, in which case the terminal 
command chmod +x "myscript.R" will do it.
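
A minimal sketch of such a script, assuming Rscript is on the PATH (the /usr/bin/env form is a common alternative to hard-coding /usr/bin/Rscript); the file name myscript.R and the message it prints are made up:

```r
#!/usr/bin/env Rscript
# Hypothetical script, saved as myscript.R

# command-line arguments given after the script name
args <- commandArgs(trailingOnly = TRUE)

cat("Hello from Rscript; got", length(args), "argument(s)\n")
```

After chmod +x myscript.R it can be started from the terminal as ./myscript.R, optionally followed by arguments.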
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
 - Jubal Early, Firefly



From:
Thomas Levine 
To:
Jonathan P Daily 
Cc:
gokhanocakoglu , r-help@r-project.org, 
r-help-boun...@r-project.org
Date:
11/02/2010 02:24 PM
Subject:
Re: [R] R script on linux?



Open a terminal, then run these two commands.

cd /home/the/directory/with/your/script
R

Then run this in R

source('yourscript.R')

Tom

2010/11/2 Jonathan P Daily :
> What is the error message?
> --
> Jonathan P. Daily
> Technician - USGS Leetown Science Center
> 11649 Leetown Road
> Kearneysville WV, 25430
> (304) 724-4480
> "Is the room still a room when its empty? Does the room,
>  the thing itself have purpose? Or do we, what's the word... imbue it."
> - Jubal Early, Firefly
>
>
>
> From:
> gokhanocakoglu 
> To:
> r-help@r-project.org
> Date:
> 11/02/2010 09:11 AM
> Subject:
> Re: [R] R  script on linux?
> Sent by:
> r-help-boun...@r-project.org
>
>
>
>
> I can't run the script the program doesn't work...
>





Re: [R] spliting first 10 words in a string

2010-11-02 Thread Matevž Pavlič
Hi all,  

Thanks for all the help. I managed to do it with what Gaj suggested (Excel :(). 

The last solution from David is also great; I just don't understand why R put 
the words in 14 columns and three rows. I would like it to put just the first 10 
words of the source field into 10 different destination fields, in the same row. 
And so on... is that possible?

Thank you, m
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of David Winsemius
Sent: Tuesday, November 02, 2010 3:47 PM
To: Gaj Vidmar
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string


On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:

> Though  in this list, in Excel it's just (literally!)  
> five clicks
> away!
> (with the column in question selected)
> Data -> Text to Columns -> Delimited -> tick Space -> Finish
> Pa je! (~Voila in Slovenian)
> (then import back to R, keeping only the first 10 columns if so  
> desired)

You could do the same thing without needing to leave R. Just  
read.table( textConnection(..), header=FALSE, fill=TRUE)

 > read.table(textConnection(words), fill=T)
   V1    V2    V3      V4    V5    V6    V7      V8       V9     V10      V11   V12 V13 V14
1   I  have     a columnn  with  text  that     has    quite       a      few words  in it.
2   I would  like      to split these words      in separate columns
3 but  just first     ten words    in   the string.       Is    that possible    in  R?
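
read.table() makes one column per word of the longest line, hence 14 columns for three rows. To keep only the first ten words of each row, in ten fields on the same row, something like the following sketch could be used. It builds on the strsplit() idea quoted further down; the column names word1 to word10 and the object names first10 and result are made up, and as.character() is there in case the text column is a factor.

```r
words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")
worddf <- data.frame(words = words, stringsAsFactors = FALSE)

# split each string on spaces and keep elements 1 to 10
# (shorter strings are padded with NA)
first10 <- t(sapply(strsplit(as.character(worddf$words), " ", fixed = TRUE),
                    `[`, 1:10))
colnames(first10) <- paste0("word", 1:10)

# original column plus ten new fields, one row per original row
result <- data.frame(worddf, first10, stringsAsFactors = FALSE)
result
```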

>
> Regards,
> Assist. Prof. Gaj Vidmar, PhD
> University Rehabilitattion Institute, Republic of Slovenia
>
> Irrelevant P.S. Long ago, before embarking on what eventually ended  
> mainly
> in statistics,
> I did two years of geology, so (and also because of knowing what the
> poster's institute does)
> I even kinda imagine what these data are.
>
> "Matevž Pavlič"  wrote in message
> news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...
>> Hi,
>>
>> I am sorry, will try to be more exact from now on...
>>
>> I have a data.frame  with a field called Opis. IT contains  
>> sentenses that
>> I would like to split in words or fields in data.frame...when I say
>> columns I mean as in Excel table. I would like to split "Opis" into  
>> ten
>> fields from the first ten words in Opis field.
>> Here is an example of my data.frame.
>>
>> 'data.frame':   22928 obs. of  12 variables:
>> $ VrtinaID: int  1 1 1 1 2 2 2 2 2 2 ...
>> $ ZapStev : int  1 2 3 4 1 2 3 4 5 6 ...
>> $ GlobinaOd   : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
>> $ GlobinaDo   : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
>> $ Opis: Factor w/ 12754 levels "","(MIVKA) DROBEN MELJAST
>> PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884  
>> 9123 2500
>> 4756 ...
>> $ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..:  
>> 154 125
>> 101 101 NA 106 125 80 106 101 ...
>> $ GeolNastOd  : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
>> $ GeolNastDo  : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
>> $ GeolNastOpis: Factor w/ 113 levels "","B. M. S.",..: 56 53 53  
>> 53 56
>> 53 53 53 53 53 ...
>> $ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
>> $ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
>> $ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1  
>> 1 1 26
>> 1 1 1 1 1 ...
>>
>> Hope that explains better...
>> Thank you, m
>>
>> -Original Message-
>> From: David Winsemius [mailto:dwinsem...@comcast.net]
>> Sent: Monday, November 01, 2010 10:13 PM
>> To: Matevž Pavlič
>> Cc: r-help@r-project.org
>> Subject: Re: [R] spliting first 10 words in a string
>>
>>
>> On Nov 1, 2010, at 4:39 PM, Matevž Pavlič wrote:
>>
>>> Hi all,
>>>
>>>
>>>
>>> I have a columnn with text that has quite a few words in it. I would
>>> like to split these words in separate columns, but just first ten
>>> words in the string. Is that possible in R?
>>>
>>>
>>
>> Not sure what a column means to you. It's not a precisely defined R
>> type or class. (And you are requested to offered a concrete example
>> rather than making us guess.)
>>
>>> words <-"I have a columnn with text that has quite a few words in
>> it. I would like to split these words in separate columns, but just
>> first ten words in the string. Is that possible in R?"
>>
>>> strsplit(words, " ")[[1]][1:10]
>> [1] "I"   "have""a"   "columnn" "with""text"
>> "that""has" "quite"   "a"
>>
>>
>> Or if in a dataframe:
>>
>>> words <-c("I have a columnn with text that has quite a few words in
>> it.",   "I would like to split these words in separate columns", "but
>> just first ten words in the string. Is that possible in R?")
>>> worddf <- data.frame(words=words)
>>
>>> t(sapply(strsplit(worddf$words, " "), "[", 1:10) )
>> [,1]  [,2][,3][,4]  [,5][,6][,7][,
>> 8]  [,9]   [,10]
>> [1,] "I"   "have"  "a" "columnn" "with"  "text"  "that"  "has"
>

Re: [R] multicore package: help

2010-11-02 Thread Patrick Connolly
On Mon, 01-Nov-2010 at 06:10PM -0400, Fahim M wrote:

|> I have matrices as below:
|> 
|> a <- matrix(c(1:10, 11, 12), 3,4)
|> aa <- data.frame(a)
|> 
|> b <- matrix(c(10:20, 21), 4,3)
|> bb <- data.frame(b)
|> ...
|> and many more matrices.
|> 
|> st = list(aa,bb, . )

There's probably a tidier way to do it, but without knowing what sort
of thing you want to do I can't say what the best way is. The following
should help you, though.

You don't need a list in this case.  A simple vector will suffice.
 
st <- c("aa", "bb", etc...)


|> 
|> mclapply(st, FUN, mc.cores=6); #this function apply the function to the
|> elements of the list 'aa', 'bb'...etc
|> 
|> 
|> FUN = function(st)

Use something different from st.  Whatever you call it will be the
individual values.

FUN <- function(x){
ind <- which(st == x) # which is the index you want.
mat.x <- get(x) # which will be the dataframe for that part of your list.

... etc...

}

then assign the output of mclapply to a list

out.list <-  mclapply(st, FUN, mc.cores=6)

You'll probably find it useful to name its elements like this:

names(out.list) <- st
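
Put together, the idea looks roughly like this. The sketch uses plain lapply() so that it runs anywhere; mclapply(st, FUN, mc.cores=6) could be substituted as above. The body of FUN, which just returns the index and the dimensions, is a made-up placeholder for the real processing.

```r
a <- matrix(c(1:10, 11, 12), 3, 4)
aa <- data.frame(a)
b <- matrix(c(10:20, 21), 4, 3)
bb <- data.frame(b)

# names of the data frames, not the data frames themselves
st <- c("aa", "bb")

FUN <- function(x) {
  ind <- which(st == x)   # position of this name in st
  mat.x <- get(x)         # the data frame with that name
  list(index = ind, dims = dim(mat.x))   # placeholder result
}

out.list <- lapply(st, FUN)
names(out.list) <- st
out.list$bb$index
```

Because the index is looked up from the name inside FUN, it does not depend on the order in which the elements happen to be processed.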




HTH




|>  {
|>  Is there a way/function to know the index of st(the list) currently
|> processed by this function as these matrices  are processed in the order of
|> availability of processors?
|> for example, if matrix bb is being processed then the index that I want is
|> 2.
|>  ...
|>  ...
|>  ...
|> 
|>  }
|> 
|>  [[alternative HTML version deleted]]
|> 

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 2:32 PM, Rainer Hurling wrote:


On 02.11.2010 19:08 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:


Inspired by colouring the dots of box-whisker plots I am trying to
also fill the boxes (rectangles) with different colours. This seems
not to work as I expected.

Looking at the help page of panel.bwplot it says: 'fill - color to
fill the boxplot'. Obviously it is only intended to fill all boxes
with only one colour?

Nevertheless the following example shows, that 'fill' from
panel.bwplot is able to work with more than one colour. But this only
works with one colour or multiples of 5 colours:


-
bp1 <- bwplot(voice.part ~ height, data = singer, main="1 color works",
panel = function(...) {
panel.bwplot(col=c("yellow"),
fill=c("yellow"), ...)
})

bp2 <- bwplot(voice.part ~ height, data = singer, main = "3 colors do NOT work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green"),
fill=c("yellow","blue","green"), ...)
})

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green","pink","red"),
fill=c("yellow","blue","green","pink","red"), ...)
})

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-

Is there any chance to use more than one filling colour correctly?




Thanks for answering.

You have eight boxes to fill and 8 dots to color. You can either supply
8 distinct colors or you can supply some lesser number and they will be
recycled across the entire 8 boxes and dots. What you cannot do (and
expect to see the dots against the fill background) is plot the dots as
the same colors as the fill.


It was not my intention to get the dots coloured in the same colour
as the boxes. Instead I am looking for a method to fill the boxes
with a predefined set of different colours (from a color vector). As
far as I can see this is only possible for one colour and multiples
of five colours.


Huh? My example used 4 colors. It should have worked with eight colors  
as well. There are eight groups and




The dots should remain uncoloured ...


Then leave out the col= argument (assuming uncolored means black.)




This will let you see all colors of dots and fill with only 4 colors
because I set it up so there were no two identical colors in the sequence
of dots and fill during the recycling:

bp4 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do work",
panel = function(...) {
panel.bwplot(col=rev(c("yellow","blue","green","pink")),
fill=c("yellow","blue","green","pink"), ...)
})


In your example you can see that the dots colors are painted in the  
right (reversed) order, the boxes are painted as sequence  
c("yellow","pink","green","blue") instead of  
c("yellow","blue","green","pink").


I do not understand how to turn over a given order and with a given  
count of colours to the boxes.


See if this example using selected colors() works to make it clearer:

> colors()[(2:9)*10]
[1] "bisque1"     "blue4"       "burlywood3"  "chartreuse3" "coral3"
[6] "cyan2"       "darkgray"    "darkorange"


bp5 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do work",
 panel = function(...) {
   panel.grid(v = -1, h = 0)
   panel.bwplot(fill=colors()[(2:9)*10], ...)
  })

bp5

(Needed to avoid the first colors() because they were mostly variants
of "white".)

> colors()[1:8]
[1] "white" "aliceblue" "antiquewhite"  "antiquewhite1"
[5] "antiquewhite2" "antiquewhite3" "antiquewhite4" "aquamarine"




Thanks in advance,
Rainer Hurling




--
David Winsemius, MD
West Hartford, CT



Re: [R] object ".trPaths" not found

2010-11-02 Thread Spackenkasper

I had the problem as well.
It seems that the reason was that Windows doesn't allow "ordinary"
administrators to edit files in the installation drive C: . 
So I - and Tinn-R - couldn't edit the files in the R directory "etc".

You can circumvent it by restarting your system with User Account Control
(Benutzerkontensteuerung) switched off. Edit the file 

/etc/Rconfig.site 

as described above and it should run. 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/object-trPaths-not-found-tp896933p3024219.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] spliting first 10 words in a string

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:


Hi all,

Thanks for all the help. I managed to do it with what Gaj suggested
(Excel :().


The last solution from David is also great; I just don't understand
why R put the words in 14 columns and three rows?


Because the maximum number of words was 14 and the fill argument was  
TRUE. There were three rows because there were three items in the  
supplied character vector.


I would like it to put just the first 10 words in the source field into 10
different destination fields, but in the same row. And so on... is that
possible?


I don't know what a destination field might be. Those are not R data  
types.


This would trim the extra columns (in this example set to those
greater than 8) by adding a lot of "NULL"'s to the end of a colClasses
specification at the expense of a warning message which can be
ignored:


> read.table(textConnection(words), fill=T, colClasses =
+   c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE )

   V1    V2    V3      V4    V5    V6    V7      V8
1   I  have     a columnn  with  text  that     has
2   I would  like      to split these words      in
3 but  just first     ten words    in   the string.
Warning message:
In read.table(textConnection(words), fill = T, colClasses = c(rep("character",  :
  cols = 14 != length(data) = 38


If you want to assign the first column to a variable then just:
> first8 <- read.table(textConnection(words), fill=T, colClasses =
+   c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE)

> var1 <- first8[[1]]
> var1
[1] "I"   "I"   "but"
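
For the "first ten words only" part of the question, a small sketch that
avoids the colClasses warning by splitting with strsplit() and keeping a
fixed number of pieces. The helper name first_n_words is hypothetical (it
is not from the thread or from any package), and shorter sentences are
padded with empty strings, which is an assumption about what is wanted.

```r
words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")

# Hypothetical helper: one row per string, one column per word, n columns
first_n_words <- function(x, n = 10) {
  parts <- strsplit(as.character(x), " +")
  # p[seq_len(n)] pads with NA when a string has fewer than n words
  m <- do.call(rbind, lapply(parts, function(p) p[seq_len(n)]))
  m[is.na(m)] <- ""
  colnames(m) <- paste0("V", seq_len(n))
  as.data.frame(m, stringsAsFactors = FALSE)
}

first10 <- first_n_words(words)
first10$V10
```

The result has the same number of rows as the input, so it can be joined
to the original data frame with cbind().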

--
David.



Thank you, m
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
] On Behalf Of David Winsemius

Sent: Tuesday, November 02, 2010 3:47 PM
To: Gaj Vidmar
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string


On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:


Though  in this list, in Excel it's just (literally!)
five clicks
away!
(with the column in question selected)
Data -> Text to Columns -> Delimited -> tick Space -> Finish
Pa je! (~Voila in Slovenian)
(then import back to R, keeping only the first 10 columns if so
desired)


You could do the same thing without needing to leave R. Just
read.table( textConnection(..), header=FALSE, fill=TRUE)


read.table(textConnection(words), fill=T)

   V1    V2    V3      V4    V5    V6    V7      V8       V9     V10
1   I  have     a columnn  with  text  that     has    quite       a
2   I would  like      to split these words      in separate columns
3 but  just first     ten words    in   the string.       Is    that
       V11   V12 V13 V14
1      few words  in it.
2
3 possible    in  R?



Regards,
Assist. Prof. Gaj Vidmar, PhD
University Rehabilitattion Institute, Republic of Slovenia

Irrelevant P.S. Long ago, before embarking on what eventually ended
mainly
in statistics,
I did two years of geology, so (and also because of knowing what the
poster's institute does)
I even kinda imagine what these data are.

"Matevž Pavlič" wrote in message
news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...

Hi,

I am sorry, will try to be more exact from now on...

I have a data.frame with a field called Opis. It contains
sentences that I would like to split into words, or fields in the
data.frame... when I say columns I mean as in an Excel table. I would like
to split "Opis" into ten fields from the first ten words in the Opis field.
Here is an example of my data.frame.

'data.frame':   22928 obs. of  12 variables:
 $ VrtinaID        : int  1 1 1 1 2 2 2 2 2 2 ...
 $ ZapStev         : int  1 2 3 4 1 2 3 4 5 6 ...
 $ GlobinaOd       : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
 $ GlobinaDo       : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
 $ Opis            : Factor w/ 12754 levels "","(MIVKA) DROBEN MELJAST PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884 9123 2500 4756 ...
 $ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..: 154 125 101 101 NA 106 125 80 106 101 ...
 $ GeolNastOd      : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
 $ GeolNastDo      : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
 $ GeolNastOpis    : Factor w/ 113 levels "","B. M. S.",..: 56 53 53 53 56 53 53 53 53 53 ...
 $ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
 $ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
 $ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1 1 1 26 1 1 1 1 1 ...

Hope that explains better...
Thank you, m

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, November 01, 2010 10:13 PM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] spliting first 10 words in a string


On Nov 1, 2010, at 4:39 PM, Matevž Pavlič wrote:


Hi all,



I have a columnn with text that has quite a few words in it. I  
would

like to split these words in separate columns, but just first ten
words in the string. Is that possible

[R] Multiple imputation for nominal data

2010-11-02 Thread John Sorkin
I am looking for an R function that will run multiple imputation (perhaps fully 
conditional imputation, MICE, or sequential generalized regression) for non-MVN 
data, specifically nominal data. My dependent variable is dichotomous, all my 
predictors are nominal. I have a total of 4,500 subjects, 1/2 of whom are 
missing the main independent variables. I would appreciate any suggestions that 
the users of the listserver might have.
John


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}



[R] predict() for plm?

2010-11-02 Thread max . e . brown
Hi,

I have a small N large T panel which I am estimating via plm, with fixed
effects.

Is there any way to get predicted values for a new dataset? (I want to
estimate parameters on a subset of my sample, and then use these to
calculate model-implied values for the whole sample).

Alternatively, is there some way of extracting the fixed effects from
the plm fitted model object (then I can calculate the predicted values
myself)?
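
One way to see what such a prediction involves, sketched with base R only.
This is a swapped-in technique, not plm itself: a within (fixed effects)
model has the same slope estimates as lm() with one dummy per individual,
so the dummy coefficients play the role of the fixed effects and predict()
accepts new data. The simulated panel, its column names and the true
coefficients are all made up for illustration. As far as I know plm also
offers fixef() for extracting the individual effects from a fitted object.

```r
set.seed(42)

# Small N (3 individuals), large T (20 periods); made-up data
panel <- data.frame(
  id   = factor(rep(1:3, each = 20)),
  time = rep(1:20, times = 3),
  x    = rnorm(60)
)
alpha <- c(1, -2, 0.5)  # true individual effects used for the simulation
panel$y <- alpha[as.integer(panel$id)] + 2 * panel$x + rnorm(60, sd = 0.1)

# Estimate on a subset of the sample (first 10 periods)
est <- subset(panel, time <= 10)
fit <- lm(y ~ 0 + id + x, data = est)  # one dummy per individual, no intercept

# Estimated fixed effects, one per individual
fe <- coef(fit)[paste0("id", levels(panel$id))]

# Model-implied values for the whole sample
panel$y_hat <- predict(fit, newdata = panel)

# The same values computed by hand from the fixed effects and the slope
y_manual <- fe[as.integer(panel$id)] + coef(fit)[["x"]] * panel$x
```

This only works for individuals that appear in the estimation subset; an
individual with no observations there has no estimated effect.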

Thanks.

Max



Re: [R] spliting first 10 words in a string

2010-11-02 Thread steven mosher
 Thanks David.

  Matevz, maybe I can help explain by doing a very simple, brute-force
approach, as opposed to the way David did it. But you should learn his methods.

I will just do a subset of your problem; if you understand how it works
then you should be able to get something done and then make it more elegant.

First, I simplify the problem by separating out the "sentence" column.

You can do this with your data frame by simply doing this

MySentence <-data.frame(sentence=yourbigDF$Opis,stringsAsFactors=FALSE)

so I take your original data.frame (yourbigDF) and I just create a copy of
that one column
 $Opis

Later we can merge the two back together after I add 10 columns for the
words


Lets make some dummy data with just 10 rows



 sentence<- "this is a sentence with ten words or maybe more than ten words"
 sentV<-rep(sentence,10)
# now I just made 10 rows of the same sentence
# NEXT, because I am going to create 10 new columns of 10 rows, I create
# 10 vectors: each is named and each has 10 elements, one for each row.
# They have NO DATA in them (just the logical default FALSE)

 
first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)

# Next I create a dataframe with Sentence in the first column and 10 blank
# columns.
# NOTE I use stringsAsFactors=FALSE

 DF <- data.frame(Sentence=sentV,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)

# This is what it would look like (the first row)
DF[1,]

                                                        Sentence first second third fourth fifth sixth seventh eighth ninth tenth
1 this is a sentence with ten words or maybe more than ten words FALSE  FALSE FALSE  FALSE FALSE FALSE   FALSE  FALSE FALSE FALSE

Next, I will show you how to assign the first ten words to the 10 blank
columns

DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]

# DF[1,2:11] selects columns 2-11 of the first row
# strsplit(...)[[1]][1:10] returns the first 10 words, which are placed in
# columns 2-11

If you want to do this the slow way you can just loop through your dataframe
row by row, or you can probably use apply.

Make more sense?
> DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
> DF[1,]
                                                        Sentence first second third   fourth fifth sixth seventh eighth ninth tenth
1 this is a sentence with ten words or maybe more than ten words  this     is     a sentence  with   ten   words     or maybe  more
> DF[1,"first"]
[1] "this"


Re: [R] visualize TukeyHSD results

2010-11-02 Thread Mendiburu, Felipe (CIP)
Dear Timothy,

Use library(agricolae)
> library(agricolae)
> a = aov(Weight~Feed)
> HSD.test(a,"Feed")

HSD.test(a,"Feed", group=TRUE)
HSD.test(a,"Feed", group=FALSE)

Regards,

Felipe de Mendiburu.
http://tarwi.lamolina.edu.pe/~fmendiburu
International Potato Center. www.cipotato.org
University: Agraria La Molina - Peru. www.lamolina.edu.pe

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Timothy Spier
Sent: Thursday, October 21, 2010 9:50 PM
To: r-help@r-project.org
Subject: [R] visualize TukeyHSD results


I am a new R user but a long time SAS user. I searched for a response to
this question but no luck, so forgive me if this topic has been covered
before. I am running a TukeyHSD post hoc test after running an ANOVA. I
get the results of all pairwise comparisons, no problem. However, the
output table is a little "busy", and I'd like to make the output easier
to read. Specifically, I would like all groups which are not
significantly different to be given the same letter. 

For example, here is a simple ANOVA with Tukey post hoc. It compares
weight gain in pigs among 4 feeds labeled "A", "B", "C", and "D":

> a = aov(Weight~Feed)
> TukeyHSD(a)
  Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = Weight ~ Feed)

$Feed
      diff        lwr       upr     p adj
B-A   6.68   1.096263 12.263737 0.0168421
C-A   8.73   2.807553 14.652447 0.0034914
D-A  -1.38  -6.963737  4.203737 0.8906642
C-B   2.05  -3.872447  7.972447 0.7530266
D-B  -8.06 -13.643737 -2.476263 0.0041505
D-C -10.11 -16.032447 -4.187553 0.0009497



What I really want would look something like this:

Feed Mean TukeyResult
C    73.4 a
B    71.3 a
A    64.6 b
D    63.2 b


Any ideas?
  
[[alternative HTML version deleted]]



Re: [R] spliting first 10 words in a string

2010-11-02 Thread Matevž Pavlič
Hi, 

Ok, I get this now. At least I think so. I got a data.frame with 15 fields; all
other words have been truncated, which is what I want. But I have that in a
separate data.frame from the one it was in before (it would be nice if it were
in the same one ...)

'data.frame':   22801 obs. of  15 variables:
 $ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...
 $ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...
 $ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...
 $ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...
 $ V5 : chr  "Z" "DO" "DO" "S" ...
 $ V6 : chr  "MALO" "r" "r" "PLASTMI" ...
 $ V7 : chr  "PODA," "=" "=" "GFs," ...
 $ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...
 $ V9 : chr  "GNETNA," "mm," "S" "" ...
 $ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...
 $ V11: chr  "" "PRODNIKI" "MALO" "" ...
 $ V12: chr  "" "DO" "PEŠČEN" "" ...
 $ V13: chr  "" "R" "S" "" ...
 $ V14: chr  "" "=" "TANKIMI" "" ...

Now I have another problem. Is it possible to count which word occurs most
often in each field (V1, V2, V3, ...), which one is second, and so on?
Ideally I would create a table for each field (V1, V2, V3, ...) with the word and
the number of occurrences in that field (column).
I suppose it could be done in SQL, but since I saw what R can do I guess
this can be done here too?

Thanks, m
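
A possible starting point for the per-column word counts, sketched with
table() and sort() from base R. The small data frame only reuses the first
few values of V1 and V2 from the str() output above, and dropping the
empty padding strings before counting is an assumption about what is
wanted.

```r
worddf <- data.frame(
  V1 = c("HUMUS", "SLABO", "MALO", "SLABO"),
  V2 = c("IN", "GRANULIRAN", "PREPEREL", "VEZAN"),
  stringsAsFactors = FALSE
)

# One frequency table per column, most frequent word first
word_counts <- lapply(worddf, function(col) {
  col <- col[col != ""]  # ignore the padding for short sentences
  sort(table(col), decreasing = TRUE)
})

word_counts$V1           # counts for the first field
names(word_counts$V1)[1] # the most frequent word in V1
```

as.data.frame(word_counts$V1) turns one of these tables into a two-column
data frame of word and frequency if that is easier to work with.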

[R] multi-level cox ph with time-dependent covariates

2010-11-02 Thread Mattia Prosperi
Dear all,

I would like to know if it is possible to fit in R a Cox ph model with
time-dependent covariates and to account for hierarchical effects at
the same time. Additionally, I'd like also to know if it would be
possible to perform any feature selection on this model fit.

I have a data set that is composed by multiple marker measurements
(and hundreds of covariates) at different time points from different
tissue samples of different patients. Suppose that the data were
coming from animal model with very few subjects (n=6) that were
followed up given a pathogen exposure, measured several times,
sampling different tissues in the same days, until a certain outcome
was reached (or outcome censored). Suppose that the pathogen can vary
over time (might be a bacteria that selects for drug-resistance) and
that also it can vary across different tissue reservoirs within the
same patient.

In other words: names(data) = patient_id, start_time, stop_time,
tissue_id, pathogen_type, marker1, ..., marker100, ..., outcome

If I had multiple observations per patient at different time
intervals, I would model it like this (hope it is correct)

model<-coxph(Surv(start_time,stop_time,outcome)~all_covariates+cluster(patient_id))

But now I have both the patient and the tissue, and hundreds of
different variables. I thought I could use the coxme library, since it
has also a ridge regression feature. Shall I then model nested random
effects by considering both the patient_id and the tissue_id?

Like model<-coxme(Surv(start_time,stop_time,outcome) ~ covariates + (1
| patient_id/tissue_id))

Then, how could I shrink the coefficients in order to select a subset
of them with non-negligible effects? Might I also consider running an
AIC-based forward-backward selection?

Thanks, and apologies if I am completely off track,

M.P.



Re: [R] Multiple imputation for nominal data

2010-11-02 Thread Andrew Miles
There are a couple of packages that do MI, including MI for nominal  
data.  The most recent of these is "mi", but I believe "mice" might do  
it as well.  Both are available on the CRAN, and both have useful  
articles that teach you how to use them.  The citations for these  
articles can be found at the bottom of the help page that appears by  
typing


?mi
OR for mice
?mice

mi is the newer package and has some useful control features, but as  
it is newer it still is under development.


Andrew Miles




Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling

On 02.11.2010 20:08 (UTC+1), David Winsemius wrote:


On Nov 2, 2010, at 2:32 PM, Rainer Hurling wrote:


On 02.11.2010 19:08 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:


Inspired by colouring the dots of box-whisker plots I am trying to
also fill the boxes (rectangles) with different colours. This seems
not to work as I expected.

Looking at the help page of panel.bwplot it says: 'fill - color to
fill the boxplot'. Obviously it is only intended to fill all boxes
with only one colour?

Nevertheless the following example shows, that 'fill' from
panel.bwplot is able to work with more than one colour. But this only
works with one colour or multiples of 5 colours:


-
bp1 <- bwplot(voice.part ~ height, data = singer, main="1 color works",
panel = function(...) {
panel.bwplot(col=c("yellow"),
fill=c("yellow"), ...)
})

bp2 <- bwplot(voice.part ~ height, data = singer, main = "3 colors do
NOT work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green"),
fill=c("yellow","blue","green"), ...)
})

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do
work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green","pink","red"),
fill=c("yellow","blue","green","pink","red"), ...)
})

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-

Is there any chance to use more than one filling colour correctly?




Thanks for answering.


You have eight boxes to fill and 8 dots to color. You can either supply
8 distinct colors or you can supply some lesser number and they will be
recycled across the entire 8 boxes and dots. What you cannot do ( and
expect to see the dots against the fill background) is plot the dots as
the same colors as the fill.


It was not my intention to get the dots coloured in the same colour as
the boxes. Instead I am looking for a method to fill the boxes with a
predefined set of different colours (from a color vector). As far as I
can see this is only possible for one colour and multiples of five
colours.


I think first I have to apologise for my bad English. Sorry for any
misunderstanding.



Huh? My example used 4 colors. It should have worked with eight colors
as well. There are eight groups and


Yes, all is ok with your example. My only problem is that these four
colours are not ordered as given by the vector (see below).



The dots should remain uncoloured ...


Then leave out the col= argument (assuming uncolored means black.)


I used these coloured dots to explain that ordered colours (from the given
vector) work with dots, but not with the boxes.



This will let you see all colors of dots and fill with only 4 colors
because I set it up so there were no two identical colors in the sequence
of dots and fill during the recycling:

bp4 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do work",
panel = function(...) {
panel.bwplot(col=rev(c("yellow","blue","green","pink")),
fill=c("yellow","blue","green","pink"), ...)
})


In your example you can see that the dots colors are painted in the
right (reversed) order, the boxes are painted as sequence
c("yellow","pink","green","blue") instead of
c("yellow","blue","green","pink").

I do not understand how to turn over a given order and with a given
count of colours to the boxes.


See if this example using selected colors() works to make it clearer:

 > colors()[(2:9)*10]
[1] "bisque1" "blue4" "burlywood3" "chartreuse3" "coral3"
[6] "cyan2" "darkgray" "darkorange"


bp5 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do
work",
panel = function(...) {
panel.grid(v = -1, h = 0)
panel.bwplot(fill=colors()[(2:9)*10], ...)
})

bp5

(Needed to avoid the first colors() because they were mostly variants of
"white".)
 > colors()[1:8]
[1] "white" "aliceblue" "antiquewhite" "antiquewhite1"
[5] "antiquewhite2" "antiquewhite3" "antiquewhite4" "aquamarine"


Of course your example with eight colours works, too. But as you can see
in the plot, the colours have a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2) coloured
"bisque1", the second box (bass1) "blue4" and so on.


I hope this explanation is a bit clearer than my preceding ones.


Thanks in advance,
Rainer Hurling


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coxph linear.predictors

2010-11-02 Thread Bond, Stephen
Re: 1. X*beta  != linear.predictor.  

The equality is stated in three different help docs, which is misleading, 
especially in light of the way glm is set up. I felt like I was wrestling with 
SAS :-)
The relative risk was the original idea behind cox regression, but it can be 
used for many non-relative purposes. If we want to calculate death probability 
in each period, then lp is no longer shift invariant.
 
Re: 2. Survfit is too slow.
It seems that the implementation follows the procedure in the original Cox 
paper, which calls iterative optimization for each death time.
My subjects are mortgages and both the estimation and the prediction samples 
are several hundred thousand. The call appears to recalculate/optimize 
everything even though only the $surv changes. Since each subject belongs to a 
single strata, most of the calculations are redundant.
I am not much of a programmer and could never figure out how to use the R 
profiler, so cannot be exact here, but the simple exponentiation takes no time 
and survfit takes several secs for each subject.
So I did:

survlong <- survfit(modlong) # a single call suffices
bl1 <- c(1, cumsum(survlong$strata) + 1) # first index of each stratum
bl2 <- cumsum(survlong$strata)           # last index of each stratum
for (jj in 1:nrow(newapp)) {

  strat <- as.integer(newapp[jj, "termfac"])
  surv <- survlong$surv[bl1[strat]:bl2[strat]] # extract the stratum
  risk <- predict(modlong, newdata = newapp[jj, ], type = "risk") # it seems there is no
  # optimization here
  newsurv <- surv^risk # done
... rest of code
}

As a package maintainer, you have to decide whether including any of the above 
and below is useful or users can figure out things on their own. Or maybe 
survfit can be made smart and subsequent calls on the same model will use the 
first call to survfit?? It's your call :-)

Kind regards

Stephen B
-Original Message-
From: Terry Therneau [mailto:thern...@mayo.edu] 
Sent: Thursday, October 28, 2010 6:39 PM
To: Bond, Stephen; David Winsemius
Cc: r-help@r-project.org
Subject: Re: [R] coxph linear.predictors

Gentlemen,
  I read R-news in batch mode so I'm often a day behind.  Let me try to
answer some of the questions.

 1. X*beta  != linear.predictor.  
I'm sorry if the documentation isn't all it could be.  Between the book,
tech report, and help I've written about 400 pages, but this particular
topic isn't yet in it.  The final snipe about being "opaque like SAS"
was really unfair.
The Cox model is a relative risk model, if lp is a linear predictor then
so is lp +c for any constant; they are equally good and equally valid.
The linear.predictor component in a coxph fit is (X-means) * beta.  The
computation exp(lp) occurs multiple times downstream and this keeps the
exp function from overflowing when there is something like a Date object
as a predictor.  Adding this constant changes not a single downstream
calculation.
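
A minimal sketch of the centering described above. The survival package's
bundled lung data and the single continuous covariate are assumptions chosen
for illustration; neither comes from the thread. It only shows that X*beta
and (X - mean)*beta differ by one constant, so relative risks are unchanged:

```r
library(survival)

# Illustrative model only: one continuous covariate from the bundled lung data.
fit  <- coxph(Surv(time, status) ~ age, data = lung)
beta <- coef(fit)[["age"]]

xb          <- lung$age * beta                      # plain X * beta
lp_centered <- (lung$age - mean(lung$age)) * beta   # (X - mean) * beta

# The difference is the same constant for every subject.
shift_spread <- diff(range(xb - lp_centered))

# Ratios of exp(lp) between two subjects (relative risks) are identical
# whichever version of the linear predictor is used.
rr_plain    <- exp(xb[1] - xb[2])
rr_centered <- exp(lp_centered[1] - lp_centered[2])

# If the centering is as described in the message, this comparison
# should report agreement for this fit.
same_as_fit <- all.equal(unname(fit$linear.predictors), lp_centered)
```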

2. Survfit is too slow.
 I'd like to hear more about this.  My work mostly involves modest data
sets so perhaps I haven't seen it.  Accuracy and maintainability have
been my first worries.

3. Baseline survival.
 Let xbase be a particular set of values for the x covariates (one for
each).  The survival curve for a given xbase is obtained from survfit
   fit <- coxph(
   sfit <- survfit(fit, newdata=xbase)
   chaz <- -log(sfit$surv)  #cumulative hazard
(The xbase vector will need to have variable names for the function to
know which value goes to which of course).

The cumulative hazard for any other subject will be 
   newhaz <- chaz * exp(fit$coef%*% (x-xbase))
There is not a simple transformation of the standard error from one fit
to another, however.  You will need to call survfit with a data frame
for newdata, which will return one curve per row with the proper values.
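
The relationship between the cumulative-hazard formula above and the
surv^risk shortcut used earlier in the thread can be checked with plain
arithmetic. All numbers below (baseline survival, coefficients, covariate
values) are made up for illustration; no model is fitted:

```r
# Hypothetical baseline survival curve at four time points for covariates xbase.
surv_base <- c(0.98, 0.95, 0.90, 0.82)

# Hypothetical coefficients and covariate vectors.
beta  <- c(0.5, -0.2)
xbase <- c(1, 3)
x     <- c(2, 1)

# Relative risk of the new subject compared with the baseline subject.
risk <- exp(sum(beta * (x - xbase)))

# Route 1: scale the cumulative hazard, then convert back to survival.
chaz     <- -log(surv_base)
newhaz   <- chaz * risk
newsurv1 <- exp(-newhaz)

# Route 2: raise the baseline survival to the power of the relative risk.
newsurv2 <- surv_base^risk
```

Both routes give the same curve because exp(-risk * chaz) equals
exp(-chaz)^risk. As noted above, the standard errors do not transform this
simply.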

In my view there is no such thing as "A" baseline survival curve.  Any
xbase you choose is a baseline.  However, it is wise to choose something
near the center of the data space in order to avoid numeric problems
with the exp function above.  I would never ever choose a vector of
zeros, although some text books do -- it saves them about 8 characters
of typing in the newhaz formula above.

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] density() function: differences with S-PLUS

2010-11-02 Thread William Dunlap
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola 
> Sturaro Sommacal (Quantide srl)
> Sent: Tuesday, November 02, 2010 3:05 AM
> To: r-help@r-project.org
> Subject: [R] density() function: differences with S-PLUS
> 
> Hello!
> 
> Does someone know what the differences are between R and S-PLUS in 
> the density()
> function?
> 
> For example, I would like to replicate this simple S-PLUS code in 
> R, but I don't
> understand which parameter I should modify to get the same results.
> 
> S-PLUS CODE:
> density(1:1000, width = 4)
> 
> R-CODE:
> density(1:1000, bw = 4, window = "g",  n = 50, cut = 0.75)
> 
> I obtain the same x values, but different y values. I also tried 
> different
> examples, with different parameters.

I needed to use the to= and from= arguments to get the same
set of x values in R and S+.  E.g.,
  z <- density(x=0, width=3, window="gaussian",
 n=2001, from=-10, to=10, cut=0.75)
gave identical x outputs in R and S+.  By using x=0
you can see the difference in the gaussian-based kernel
used by R and Splus:
  plot(z$x, z$y, pch=".", log="y")
Splus, as its help("density") states, uses a truncated
Gaussian kernel:
  "The "gaussian" window is truncated at 4
  standard deviations (and then scaled
  appropriately to adjust for the truncated
  area)."
R appears to not truncate the Gaussian kernel. 
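
A rough sketch of the two kernels, to make the difference concrete. The
function trunc_gauss is hypothetical, and it assumes that "scaled
appropriately" means dividing by the probability mass kept inside 4 standard
deviations; the S-PLUS source was not consulted:

```r
# Hypothetical truncated Gaussian kernel: zero beyond k standard deviations,
# rescaled so that it still integrates to 1 (an assumption about S-PLUS).
trunc_gauss <- function(x, sd = 1, k = 4) {
  kept_mass <- pnorm(k) - pnorm(-k)
  ifelse(abs(x) <= k * sd, dnorm(x, sd = sd) / kept_mass, 0)
}

sd_k  <- 0.75                         # arbitrary kernel SD for the comparison
x     <- seq(-10, 10, length.out = 2001)
full  <- dnorm(x, sd = sd_k)          # untruncated kernel
trunc <- trunc_gauss(x, sd = sd_k)    # truncated and rescaled kernel

# Area under the truncated kernel on its support (sd = 1, k = 4).
area <- integrate(trunc_gauss, lower = -4, upper = 4)$value
```

On a log scale, as in the plot() call above, the truncated version drops to
zero in the tails while the untruncated one keeps decaying smoothly.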

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> Can you help me?
> 
> Thank you in advance.
> 
> Nicola Sturaro
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ForestPlot or similar

2010-11-02 Thread Mestat

Thanks Matt,
I am having a problem now to use this function. The function separately
works fine. But the problem is that I am working with a simulation, so i
placed the CREDPLOT function in my program and added the following commands
according my data:

#MY DATA, ESTIMATES, LOWER AND UPPER INTERVALS
rw_cibas_quantile_ori_m<-rw_quantile_app_ori[-51:-1000]
rw_cibas_low_quantile_ori_l<-rw_cibas_low_quantile_ori[-51:-1000]
rw_cibas_up_quantile_ori_u<-rw_cibas_up_quantile_ori[-51:-1000]

#GRAPHIC
jpeg ('Nfp_rw_bas_quantile_ori.jpeg')
forestplot(rw_cibas_quantile_ori_m,rw_cibas_low_quantile_ori_l,rw_cibas_up_quantile_ori_u,cen=403.677)
dev.off()

My program is running fine, but I am not getting any graphic. I did the
graphic using the function FORESTPLOT, but the graphic provided by the
function CREDPLOT is much better. Here is my code:

rw_ciper_gini_ori_m<-rw_gini_app_ori[-51:-1000]
rw_ciper_low_gini_ori_l<-rw_ciper_low_gini_ori[-51:-1000]
rw_ciper_up_gini_ori_u<-rw_ciper_up_gini_ori[-51:-1000]
tabletext<-cbind(c(rep(" ",50),NA))
rw_ciper_gini_ori_m<-c(rw_ciper_gini_ori_m,NA)
rw_ciper_low_gini_ori_l<-c(rw_ciper_low_gini_ori_l,NA)
rw_ciper_up_gini_ori_u<-c(rw_ciper_up_gini_ori_u,NA)
jpeg ('Sfp_rw_per_gini_ori.jpeg')
forestplot(tabletext,rw_ciper_gini_ori_m,rw_ciper_low_gini_ori_l,rw_ciper_up_gini_ori_u,zero=0.4,col=meta.colors(box="royalblue",line="darkblue"))
dev.off()

Any information about what is missing/wrong in order to obtain the graphic
with the function CREDPLOT is welcome.
Thanks in advance,
Marcio
-- 
View this message in context: 
http://r.789695.n4.nabble.com/ForestPlot-or-similar-tp3020374p3024354.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple imputation for nominal data

2010-11-02 Thread John Sorkin
Thank you!
John




John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Andrew Miles  11/2/2010 3:59 PM >>>
There are a couple of packages that do MI, including MI for nominal  
data.  The most recent of these is "mi", but I believe "mice" might do  
it as well.  Both are available on the CRAN, and both have useful  
articles that teach you how to use them.  The citations for these  
articles can be found at the bottom of the help page that appears by  
typing

?mi
OR for mice
?mice

mi is the newer package and has some useful control features, but as  
it is newer it still is under development.

Andrew Miles


On Nov 2, 2010, at 3:38 PM, John Sorkin wrote:

> I am looking for an R function that will run multiple imputation  
> (perhaps fully conditional imputation, MICE, or sequential  
> generalized regression) for non-MVN data, specifically nominal data.  
> My dependent variable is dichotomous, all my predictors are nominal.  
> I have a total of 4,500 subjects, 1/2 of whom are missing the main  
> independent variables. I would appreciate any suggestions that the  
> users of the listserver might have.
> John
>
>
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
> Confidentiality Statement:
> This email message, including any attachments, is for ...{{dropped:18}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

snipped quite a bit of talking past each other


Of course your example with eight colours works, too. But as you can  
see in the plot, the colours have a different order than in the vector  
'colors()[(2:9)*10]' itself. I expected the first box (bass2)  
coloured "bisque1", the second box (bass1) "blue4" and so on.


Oh. Try putting the fill argument outside the panel and see if the  
panel handles it in the manner you expect:


bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg  
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3'  
'coral3' 'cyan2' 'darkgray' 'darkorange", fill=colors()[(2:11)*10],

  panel = function(...) {
panel.grid(v = -1, h = 0)
panel.bwplot( ...)
   })
 bp3



I hope this explanation is a bit clearer than my preceding ones.


And I hope my suggestion now "works".





Thanks in advance,
Rainer Hurling


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Display of NAs in character columns of a data frame under fix() or edit().

2010-11-02 Thread Rolf Turner

Example:

xxx <- data.frame(x=1:26,y=letters)
xxx$x[c(2,4,6,8)] <- NA
xxx$y[c(1,3,5,7)] <- NA

yyy <- edit(xxx)

The missing values in xxx$y appear as blanks in the spreadsheet window that
appears, whereas the missing values in the numeric column "x" appear as "NA"
(as I would expect).

Is this a bug or a feature?

cheers,

Rolf Turner

P.S.

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/C/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] datasets  utils stats graphics  grDevices methods   base 

other attached packages:
[1] misc_0.0-13 gtools_2.6.2spatstat_1.20-5 deldir_0.0-12  
[5] mgcv_1.6-2  fortunes_1.4-0  MASS_7.3-8 

loaded via a namespace (and not attached):
[1] grid_2.12.0lattice_0.19-13Matrix_0.999375-44 nlme_3.1-97   
[5] tools_2.12.0  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splitting First 10 words in a string

2010-11-02 Thread Matevž Pavlič
Hi Steven, 

 

Thank you for the help. I get an error though when i do this :

 

>lit<-read.csv("litologija.csv", sep=";", dec=".")

>sent <-data.frame(sentence=lit$Opis,stringsAsFactors=FALSE)

>str(sent)

>sentV<-rep(sent,10)

>str(sentV)

 

>first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)

>DF <-data.frame(Sentence=sent,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)

 

»Error in data.frame(Sentence = sent, first, second, third, fourth, fifth,  : 

arguments imply differing number of rows: 22928, 10«

 

What am I doing wrong?

 

Thanks, m

 

 

 

From: steven mosher [mailto:mosherste...@gmail.com] 
Sent: Tuesday, November 02, 2010 8:45 PM
To: David Winsemius
Cc: Matevž Pavlič; Gaj Vidmar; r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string

 

 Thanks david.

 

  Matevz, maybe I can help explain by doing a very simple and brute force 
approach

as opposed to  the way david did it. But you should learn his methods.

 

I will just do a subset of your problem and if you understand how it works then 
you should

be able to get something done and then make it more elegant.

 

First, I simplify the problem by separating out the "sentence" column.

 

You can do this with your data frame by simply doing this

 

MySentence <-data.frame(sentence=yourbigDF$Opis,stringsAsFactors=FALSE)

 

so I take your original data.frame (yourbigDF) and I just create a copy of that 
one column

 $Opis

 

Later we can merge the two back together after I add 10 columns for the words

 

 

Let's make some dummy data with just 10 rows

 

 

 

 sentence<- "this is a sentence with ten words or maybe more than ten words"

 sentV<-rep(sentence,10)

# now I just made 10 rows of the same sentence

# NEXT because I am going to create 10 new columns of 10 rows I create

# 10 vectors; each is named and each has 10 elements for the rows.

# they have NO DATA in them

 

 
first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)

 

#Next I create a dataframe with Sentence in the first column and 10 blank 
columns.

# NOTE I use stringsAsFactors=FALSE

 

 DF <-data.frame(Sentence=sentence,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)

 

# This is what it would look like ( the first row)

DF[1,]

 

Sentence first second third fourth fifth sixth seventh eighth ninth tenth

1 this is a sentence with ten words or maybe more than ten words FALSE  FALSE 
FALSE  FALSE FALSE FALSE   FALSE  FALSE FALSE FALSE

 

Next, I will show you how to assign the first ten words to the 10 blank columns

 

DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]

 

#DF[1,2:11]  selects the columns 2-11 of the first row

#strsplit  returns the first 10 words [1:10] and places them in the columns 2-11

 

If you want to do this the slow way you can just loop through your dataframe 
row by row

or you can probably use apply.
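
A minimal sketch of doing the same for every row at once, without an explicit
loop. The helper first_words and the second example sentence are made up for
illustration; sentences shorter than ten words are padded with NA:

```r
# Two example sentences; the second one is deliberately short.
sentences <- c("this is a sentence with ten words or maybe more than ten words",
               "a short one")

# Hypothetical helper: return exactly n words, truncating or padding with NA.
first_words <- function(s, n = 10) {
  w <- strsplit(s, " +")[[1]]
  length(w) <- n
  w
}

# vapply() returns a 10 x nrow matrix; transpose to one row per sentence.
words <- t(vapply(sentences, first_words, character(10), USE.NAMES = FALSE))
colnames(words) <- c("first", "second", "third", "fourth", "fifth",
                     "sixth", "seventh", "eighth", "ninth", "tenth")

DF <- data.frame(Sentence = sentences, words, stringsAsFactors = FALSE)
```

Because every vector passed to data.frame() here has one element per sentence,
the "differing number of rows" error from mixing a long column with
length-10 vectors does not arise.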

 

Make more sense?

> DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]

> DF[1,]

Sentence first second 
third   fourth fifth sixth seventh eighth ninth tenth

1 this is a sentence with ten words or maybe more than ten words  this is   
  a sentence  with   ten   words or maybe  more

> DF[1,"first"]

[1] "this"

 

On Tue, Nov 2, 2010 at 12:22 PM, David Winsemius  wrote:


On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:

Hi all,

Thanks for all the help. I managed to do it with what Gaj suggested (Excel :().

The last solution from David is also great, I just don't understand why R put 
the words in 14 columns and three rows?

 

Because the maximum number of words was 14 and the fill argument was TRUE. 
There were three rows because there were three items in the supplied character 
vector.

 

I would like it to put just the first 10 words in source field to 10 
different destination fields, but the same row. And so on...is that possible?

 

I don't know what a destination field might be. Those are not R data types.

This would trim the extra columns (in this example set to those greater than 8) 
by adding a lot of "NULL"'s to the end of a colClasses specification  at 
the expense of a warning message which can be ignored:

> read.table(textConnection(words), fill=T, colClasses = c(rep("character", 8), 
> rep("NULL", 30) ) , stringsAsFactors=FALSE )


  V1V2V3  V4V5V6V7  V8

1   I  have a columnn  with  text  that has

2   I would  like  to split these words  in

3 but  just first ten wordsin   the string.

Warning message:
In read.table(textConnection(words), fill = T, colClasses = c(rep("character",  
:
 cols = 14 != length(data) = 38


If you want to assign the first column to a variable then just:
> first8 <- read.table(textConnection(words), fill=T, colClasses = 
> c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=F

Re: [R] ForestPlot or similar

2010-11-02 Thread Abhijit Dasgupta
You need to use a print statement

print(forestplot())

Lattice and ggplot2 need to be explicitly printed to get output into 
jpeg. I believe Matt's function only provides the graphics object and 
not the printed version.
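
A small sketch of the point about printing, using a lattice plot as a
stand-in because the CREDPLOT function itself is not shown in the thread. The
wrapper draw_to_file is hypothetical, and pdf() is used only so the sketch
does not depend on JPEG support being available; the same pattern applies to
jpeg():

```r
library(lattice)

# Hypothetical wrapper standing in for code that runs inside a simulation.
draw_to_file <- function(path) {
  pdf(path)
  p <- bwplot(voice.part ~ height, data = singer)
  # Inside a function or loop a lattice object is not auto-printed,
  # so without this explicit print() nothing is drawn on the device.
  print(p)
  invisible(dev.off())
  path
}

out <- draw_to_file(file.path(tempdir(), "bwplot_demo.pdf"))
```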

Abhijit
On 11/2/2010 4:32 PM, Mestat wrote:
> Thanks Matt,
> I am having a problem now to use this function. The function separately
> works fine. But the problem is that I am working with a simulation, so i
> placed the CREDPLOT function in my program and added the following commands
> according my data:
>
> #MY DATA, ESTIMATES, LOWER AND UPPER INTERVALS
> rw_cibas_quantile_ori_m<-rw_quantile_app_ori[-51:-1000]
> rw_cibas_low_quantile_ori_l<-rw_cibas_low_quantile_ori[-51:-1000]
> rw_cibas_up_quantile_ori_u<-rw_cibas_up_quantile_ori[-51:-1000]
>
> #GRAPHIC
> jpeg ('Nfp_rw_bas_quantile_ori.jpeg')
> forestplot(rw_cibas_quantile_ori_m,rw_cibas_low_quantile_ori_l,rw_cibas_up_quantile_ori_u,cen=403.677)
> dev.off()
>
> My program is running fine, but I am not getting any graphic. I did the
> graphic using the function FORESTPLOT, but the graphic provided by the
> function CREDPLOT is much better. Here is my code:
>
> rw_ciper_gini_ori_m<-rw_gini_app_ori[-51:-1000]
> rw_ciper_low_gini_ori_l<-rw_ciper_low_gini_ori[-51:-1000]
> rw_ciper_up_gini_ori_u<-rw_ciper_up_gini_ori[-51:-1000]
> tabletext<-cbind(c(rep(" ",50),NA))
> rw_ciper_gini_ori_m<-c(rw_ciper_gini_ori_m,NA)
> rw_ciper_low_gini_ori_l<-c(rw_ciper_low_gini_ori_l,NA)
> rw_ciper_up_gini_ori_u<-c(rw_ciper_up_gini_ori_u,NA)
> jpeg ('Sfp_rw_per_gini_ori.jpeg')
> forestplot(tabletext,rw_ciper_gini_ori_m,rw_ciper_low_gini_ori_l,rw_ciper_up_gini_ori_u,zero=0.4,col=meta.colors(box="royalblue",line="darkblue"))
> dev.off()
>
> Any information about what is missing/wrong in order to obtain the graphic
> with the function CREDPLOT is welcome.
> Thanks in advance,
> Marcio
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hooks into dynamic help system?

2010-11-02 Thread Kevin Wright
What is the easiest way to modify the dynamic html help files?

For example, I would like to put this link on every page:
/doc/html/packages.html
Or a link to the Rseek engine with the title of the page passed as a
parameter to Rseek.
etc.

Do I re-write Rd2HTML and put it higher in the search path?

Are there hook-like functions that I can use to post-process the html code?


Are there parameters for template-type text strings that can be inserted
into the header or footer of the html page?

Any pointers would be appreciated.

-- 
Kevin Wright

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] stacking effectively - without loops

2010-11-02 Thread Dimitri Liakhovitski
Hello!

I have 2 vectors:

x<-letters[1:5]
y<-1:3

Is there a way - without loops - to create a data frame such that we
repeat the whole "y" within each level of "x" so that it looks like
this:

a 1
a 2
a 3
b 1
b 2
b 3
c 1
c 2
c 3

etc?

Thank you!

-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stacking effectively - without loops

2010-11-02 Thread Dimitri Liakhovitski
Never mind - found it: expand.grid(y,x)

On Tue, Nov 2, 2010 at 4:57 PM, Dimitri Liakhovitski
 wrote:
> Hello!
>
> I have 2 vectors:
>
> x<-letters[1:5]
> y<-1:3
>
> Is there a way - without loops - to create a data frame such that we
> repeat the whole "y" within each level of "x" so that it looks like
> this:
>
> a 1
> a 2
> a 3
> b 1
> b 2
> b 3
> c 1
> c 2
> c 3
>
> etc?
>
> Thank you!
>
> --
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com
>



-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stacking effectively - without loops

2010-11-02 Thread Johannes Huesing
Dimitri Liakhovitski  [Tue, Nov 02, 2010 at 
09:57:04PM CET]:
> Hello!
> 
> I have 2 vectors:
> 
> x<-letters[1:5]
> y<-1:3
> 
> Is there a way - without loops - to create a data frame such that we
> repeat the whole "y" within each level of "x" so that it looks like
> this:
> 
> a 1
> a 2
> a 3
> b 1
> b 2
> b 3
> c 1
> c 2
> c 3
> 
> etc?

?expand.grid


-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:johan...@huesing.name  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, "Life on the Mississippi")

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stacking effectively - without loops

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 4:58 PM, Dimitri Liakhovitski wrote:


Never mind - found it: expand.grid(y,x)


Yes, that is one way and is a way that was illustrated yesterday for a  
very similar question on r-help by (perhaps?) Grothendieck. Another  
way is:


data.frame(lets = rep(letters[1:5], each=3), nums=rep(1:3, 5) )

There are at least two different ways that rep() can be invoked and  
each= is not the default.
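
For comparison, a short sketch putting the two approaches side by side. The
column names and the reordering of columns are choices made here for
illustration only:

```r
x <- letters[1:5]
y <- 1:3

# expand.grid() varies its first argument fastest, so y goes first
# and the columns are reordered afterwards.
g <- expand.grid(y = y, x = x, stringsAsFactors = FALSE)
g <- g[, c("x", "y")]

# rep() with each= repeats every element in place; times= repeats the
# whole vector end to end.
r <- data.frame(x = rep(x, each = length(y)),
                y = rep(y, times = length(x)),
                stringsAsFactors = FALSE)

head(g, 4)
```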


--
david.



On Tue, Nov 2, 2010 at 4:57 PM, Dimitri Liakhovitski
 wrote:

Hello!

I have 2 vectors:

x<-letters[1:5]
y<-1:3

Is there a way - without loops - to create a data frame such that we
repeat the whole "y" within each level of "x" so that it looks like
this:

a 1
a 2
a 3
b 1
b 2
b 3
c 1
c 2
c 3

etc?

Thank you!

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com





--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling

On 02.11.2010 21:43 (UTC+1), David Winsemius wrote:


On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

snipped quite a bit of talking past each other


Of course your example with eight colours works, too. But as you can
see in the plot, the colours have a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2) coloured
"bisque1", the second box (bass1) "blue4" and so on.


Oh. Try putting the fill argument outside the panel and see if the panel
handles it in the manner you expect:

bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3'
'cyan2' 'darkgray' 'darkorange", fill=colors()[(2:11)*10],
panel = function(...) {
panel.grid(v = -1, h = 0)
panel.bwplot( ...)
})
bp3



I hope this explanation is a bit clearer than my preceding ones.


And I hope my suggestion now "works".


Thank you for the hint, that it works also outside of the panel. It 
looks like I missed the wood for trees here ;-)


In your latest, special case the colours work. After having a closer 
look at it I found that your colour vector has length 10 (2:11), and 
only the first eight colours are filled in the boxes.


This seems to be reproducible:

### NOT WORKING: 8 colours, not in the order of the given vector
bwplot(voice.part ~ height, data = singer,
  main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 
'pink' 'violet' 'brown' 'gold'",

  fill=c("yellow","blue","green","red","pink","violet","brown","gold"))

### WORKING: 10 (8+2*NA) colours in order of given vector
bwplot(voice.part ~ height, data = singer,
  main = "RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 'pink' 
'violet' 'brown' 'gold'",
  fill=c("yellow","blue","green","red","pink","violet","brown","gold", 
NA, NA))


I really do not understand what is going on here,
Rainer


David Winsemius, MD
West Hartford, CT


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] count different words in a field

2010-11-02 Thread Matevž Pavlič
Hi all, 

 

I started to ask this in the other post, but it is off topic...so here it is 
again.

 

I have a data.frame (created with the help of this mailing list) that looks like 
this :

 

'data.frame':   22801 obs. of  15 variables:

$ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...

$ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...

$ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...

$ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...

$ V5 : chr  "Z" "DO" "DO" "S" ...

$ V6 : chr  "MALO" "r" "r" "PLASTMI" ...

$ V7 : chr  "PODA," "=" "=" "GFs," ...

$ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...

$ V9 : chr  "GNETNA," "mm," "S" "" ...

$ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...

$ V11: chr  "" "PRODNIKI" "MALO" "" ...

$ V12: chr  "" "DO" "PEŠČEN" "" ...

$ V13: chr  "" "R" "S" "" ...

$ V14: chr  "" "=" "TANKIMI" "" ...

 

Is it possible to count which word occurs most often in each field (V1, V2, 
V3, ...) and which one is the second and so on? Ideally I would like to create 
a table for each field (V1, V2, V3, ...) with the prevailing word and the 
number of occurrences of that word in that field (column).

 

Hope that explains it ok...

 

Thank you, m

 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] count different words in a field

2010-11-02 Thread Matevž Pavlič
Nevermind, i think summary() does this ...
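
One caveat, offered as a hedged note: summary() gives per-level counts for
factor columns, but for character columns like the ones shown by str() it
only reports length, class and mode. A minimal sketch using table() instead;
the small data frame reuses the first few values visible in the str() output
and is otherwise made up:

```r
# Tiny stand-in for the real data frame (first values from the str() output).
words <- data.frame(V1 = c("HUMUS", "SLABO", "MALO", "SLABO"),
                    V2 = c("IN", "GRANULIRAN", "PREPEREL", "VEZAN"),
                    stringsAsFactors = FALSE)

# One frequency table per column, empty strings dropped, most frequent first.
counts <- lapply(words, function(col) {
  sort(table(col[col != ""]), decreasing = TRUE)
})

# Most frequent word in V1 and how often it occurs.
top_word_v1  <- names(counts$V1)[1]
top_count_v1 <- counts$V1[[1]]
```

Each element of counts is a table for one field, so counts$V3, counts$V4 and
so on can be inspected in the same way on the full data.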



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Matevž Pavlič
Sent: Tuesday, November 02, 2010 10:12 PM
To: r-help@r-project.org
Subject: [R] count different words in a field

Hi all, 

 

I started to ask this in the other post, but it is off topic...so here it is 
again.

 

I have a data.frame (created with the help of this mailing list) that looks like 
this :

 

'data.frame':   22801 obs. of  15 variables:

$ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...

$ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...

$ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...

$ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...

$ V5 : chr  "Z" "DO" "DO" "S" ...

$ V6 : chr  "MALO" "r" "r" "PLASTMI" ...

$ V7 : chr  "PODA," "=" "=" "GFs," ...

$ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...

$ V9 : chr  "GNETNA," "mm," "S" "" ...

$ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...

$ V11: chr  "" "PRODNIKI" "MALO" "" ...

$ V12: chr  "" "DO" "PEŠČEN" "" ...

$ V13: chr  "" "R" "S" "" ...

$ V14: chr  "" "=" "TANKIMI" "" ...

 

Is it possible to count which word occurs most often in each field (V1, V2, 
V3, ...) and which one is the second and so on? Ideally I would like to create 
a table for each field (V1, V2, V3, ...) with the prevailing word and the 
number of occurrences of that word in that field (column).

 

Hope that explains it ok...

 

Thank you, m

 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] connecting points into a smooth curve

2010-11-02 Thread Greg Snow
In addition to the other responses you have received, the xspline function may 
also be of use.
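
As a minimal sketch of that suggestion (the five points below are made-up data, not the original poster's), xspline() from the graphics package draws a curve through the points when shape is negative, and spline() is an interpolating alternative:

```r
# Made-up example data: five scatter points (an assumption for illustration)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 5, 3, 6, 4)

plot(x, y, pch = 19)

# X-spline: a negative shape makes the curve pass through the control points
# (for an open X-spline the two end points are always treated as shape 0).
xspline(x, y, shape = -0.5, open = TRUE, border = "red")

# Alternative: cubic spline interpolation evaluated on a fine grid
sm <- spline(x, y, n = 100)
lines(sm, col = "blue", lty = 2)
```

With shape = 0 xspline() just reproduces the straight segments, and positive values give a smoother curve that no longer touches the interior points.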

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of tooblue
> Sent: Monday, November 01, 2010 12:19 AM
> To: r-help@r-project.org
> Subject: [R] connecting points into a smooth curve
> 
> 
> I have, say, five scatter points and want to connect them into a
> smooth curve.
> I did plot(x,y,type="l"), but the graph is five segments connected
> to each other, not a smooth curve.
> I wonder if there is a line type that is a curve. Thanks!
> 
> 


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 5:08 PM, Rainer Hurling wrote:


On 02.11.2010 21:43 (UTC+1), David Winsemius wrote:


On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

snipped quite a bit of talking past each other


Of course your example with eight colours works, too. But as you can
see in the plot, the colours have a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2) to be coloured
"bisque1", the second box (bass1) "blue4" and so on.


Oh. Try putting the fill argument outside the panel and see if the panel
handles it in the manner you expect:

bp3 <- bwplot(voice.part ~ height, data = singer,
  main = "fill arg outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3' 'cyan2' 'darkgray' 'darkorange'",
  fill = colors()[(2:11)*10],
  panel = function(...) {
    panel.grid(v = -1, h = 0)
    panel.bwplot(...)
  })
bp3



I hope this explanation is a bit clearer than my preceding ones.


And I hope my suggestion now "works".


Thank you for the hint that it also works outside of the panel. It
looks like I missed the wood for the trees here ;-)


In your latest, special case the colours work. After having a closer
look at it I found that your colour vector has length 10 (2:11), and
only the first eight colours are filled in the boxes.


I don't know why the ordering is only irregularly preserved ...
apparently in situations where the number of colors is a multiple of
5. Perhaps a question that Sarkar, Andrews or Ehlers can answer. I
looked at the code for bwplot and it uses panel.polygon for drawing
the rectangles. The colors and other graphical parameters are supposed
to be picked up from the box.rectangle settings in par.settings.
(Trying to set those also failed.) I also looked at panel.polygon and
do not see a reason for the shuffling of colors.


Wrong order also:
> bp3 <- bwplot(voice.part ~ height, data = singer,
+   main = "fill arg outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3' 'cyan2' 'darkgray' 'darkorange'",
+   par.settings = list(box.rectangle = list(fill = colors()[(2:9)*10])),
+   horizontal = TRUE,
+   panel = function(...) {
+     panel.grid(v = -1, h = 0)
+     panel.bwplot(...)
+   })
> bp3




This seems to be reproducible:

### NOT WORKING: 8 colours, not in the order of the given vector
bwplot(voice.part ~ height, data = singer,
  main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 'pink' 'violet' 'brown' 'gold'",
  fill = c("yellow","blue","green","red","pink","violet","brown","gold"))

### WORKING: 10 (8+2*NA) colours, in the order of the given vector
bwplot(voice.part ~ height, data = singer,
  main = "RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 'pink' 'violet' 'brown' 'gold'",
  fill = c("yellow","blue","green","red","pink","violet","brown","gold",
           NA, NA))


I really do not understand what is going on here,


Me either.


Rainer


David Winsemius, MD
West Hartford, CT



Re: [R] count different words in a field

2010-11-02 Thread David Winsemius


On Nov 2, 2010, at 5:11 PM, Matevž Pavlič wrote:


Hi all,

I started to ask this in the other thread, but it was off topic there, so
here it is again.

I have a data.frame (created with the help of this mailing list) that
looks like this:




? table
> tbl <- table(c("HUMUS", "SLABO", "MALO", "SLABO"))
> tbl[order(tbl)][1]
HUMUS
1

Just make a function that does this to a vector and use lapply(dfrm, func)
on the data frame.
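
Note that order() sorts in ascending order, so the snippet above returns the least frequent word. A minimal sketch of the per-column function with the most frequent word first might look like this; the small data frame and the name word_counts are made up for illustration and stand in for the poster's real data:

```r
# Made-up stand-in for the poster's data frame (only two columns, hypothetical values)
dfrm <- data.frame(
  V1 = c("HUMUS", "SLABO", "MALO", "SLABO"),
  V2 = c("IN", "GRANULIRAN", "IN", ""),
  stringsAsFactors = FALSE
)

# Frequency table for one column, most frequent word first.
# Empty strings are dropped because they are padding, not words.
word_counts <- function(x) {
  x <- x[!is.na(x) & nzchar(x)]
  sort(table(x), decreasing = TRUE)
}

# One sorted table per column (V1, V2, ...)
counts <- lapply(dfrm, word_counts)

counts$V1             # full ranking for column V1
names(counts$V2)[1]   # prevailing word in column V2
```

Each element of counts is an ordinary table, so head(counts$V1, 3) would give the three most common words in that column.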


--
David.






David Winsemius, MD
West Hartford, CT



[R] Line numbers in Sweave

2010-11-02 Thread Yihui Xie
Hi,

I thumbed through the source code of Sweave.R but was unable to figure
out when (under what conditions) R will insert the line numbers into the
output. The R 2.12.0 news said:

• Parsing errors detected during Sweave() processing will now be
  reported referencing their original location in the source file.

Do we have any options to turn off this reporting? Thanks!

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling

On 02.11.2010 22:37 (UTC+1), David Winsemius wrote:



I don't know why the ordering is only irregularly preserved ...
apparently in situations where the number of colors is a multiple of 5.
Perhaps a question that Sarkar, Andrews or Ehlers can answer. I looked
at the code for bwplot and it uses panel.polygon for drawing the
rectangles. The colors and other graphical parameters are supposed to be
picked up from the box.rectangle settings in par.settings. (Trying to
set those also failed.) I also looked at panel.polygon and do not see a
reason for the shuffling of colors.


I also hope that someone from 'inner circle' would have a look ;-)


Wrong order also:
> bp3 <- bwplot(voice.part ~ height, data = singer,
+   main = "fill arg outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3' 'cyan2' 'darkgray' 'darkorange'",
+   par.settings = list(box.rectangle = list(fill = colors()[(2:9)*10])),
+   horizontal = TRUE,
+   panel = function(...) {
+     panel.grid(v = -1, h = 0)
+     panel.bwplot(...)
+   })
> bp3


Yes, I tried to manipulate box.rectangle myself, also without success. I
think the design of panel.bwplot originally allows for only one
fill color (just a guess).





Thank you so far. I am afraid I have to go to bed. In just a few hours I 
have to work for my employer again ...



Rainer




Re: [R] Line numbers in Sweave

2010-11-02 Thread Duncan Murdoch

On 02/11/2010 5:50 PM, Yihui Xie wrote:

Hi,

I thumbed through the source code Sweave.R but was unable to figure
out when (under what conditions) R will insert the line numbers to the
output. The R 2.12.0 news said:

 • Parsing errors detected during Sweave() processing will now be
   reported referencing their original location in the source file.

Do we have any options to turn off this reporting? Thanks!


Sure:  just don't include any syntax errors.

Duncan Murdoch



Re: [R] splitting First 10 words in a string

2010-11-02 Thread steven mosher
 That's easy: you are getting confused by the dummy code I sent.

 Do this:

 lit<-read.csv("litologija.csv", sep=";", dec=".")
sent <-data.frame(sentence=lit$Opis,stringsAsFactors=FALSE)
first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=nrow(
sent)

I put the length of the vector to 10 just to do a dummy problem.

Then do this:

for(j in 1:nrow(sent)) {

  sent[j,2:11]<-strsplit(sent[j,1]," ")[[1]][1:10]

}


That will get you a result the crude, brute-force way.

Try that.

Then you can learn the sapply way, but first you need to learn R data
structures.
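
For reference, a rough sketch of the vectorised version of the loop above, using vapply (a stricter relative of sapply). The two sentences and the helper name first_n_words are made up for illustration; in the real problem sent would be built from lit$Opis as shown earlier:

```r
# Made-up stand-in for the 'sent' data frame built from lit$Opis
sent <- data.frame(
  sentence = c("this is a sentence with ten words or maybe more than ten words",
               "short one"),
  stringsAsFactors = FALSE
)

# Split one sentence on runs of spaces and return exactly n words,
# padding with NA when the sentence has fewer than n words.
first_n_words <- function(s, n = 10) {
  w <- strsplit(s, " +")[[1]]
  w <- w[nzchar(w)]
  length(w) <- n          # truncates to n, or pads with NA
  w
}

# vapply returns a 10 x nrow matrix; transpose to get one row per sentence
words <- t(vapply(sent$sentence, first_n_words, character(10), USE.NAMES = FALSE))
colnames(words) <- c("first", "second", "third", "fourth", "fifth",
                     "sixth", "seventh", "eighth", "ninth", "tenth")

result <- data.frame(sent, words, stringsAsFactors = FALSE)
result[1, "first"]
```

The design choice here is to build all ten word columns in one matrix and bind them once, instead of assigning into the data frame row by row, which gets slow with some 22000 rows.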





On Tue, Nov 2, 2010 at 1:47 PM, Matevž Pavlič wrote:

> Hi Steven,
>
>
>
> Thank you for the help. I get an error though when i do this :
>
>
>
> >lit<-read.csv("litologija.csv", sep=";", dec=".")
>
> >sent <-data.frame(sentence=lit$Opis,stringsAsFactors=FALSE)
>
> >str(sent)
>
> >sentV<-rep(sent,10)
>
> >str(sentV)
>
>
>
>
> >first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)
>
> >DF
> <-data.frame(Sentence=sent,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)
>
>
>
> »Error in data.frame(Sentence = sent, first, second, third, fourth, fifth,
> :
>
> arguments imply differing number of rows: 22928, 10«
>
>
>
> What am I doing wrong?
>
>
>
> Thanks, m
>
>
>
>
>
>
>
> *From:* steven mosher [mailto:mosherste...@gmail.com]
> *Sent:* Tuesday, November 02, 2010 8:45 PM
> *To:* David Winsemius
> *Cc:* Matevž Pavlič; Gaj Vidmar; r-h...@stat.math.ethz.ch
> *Subject:* Re: [R] spliting first 10 words in a string
>
>
>
>  Thanks david.
>
>
>
>   Matevz, maybe I can help explain by doing a very simple and brute force
> approach
>
> as opposed to  the way david did it. But you should learn his methods.
>
>
>
> I will just do a subset of your problem and if you understand how it works
> then you should
>
> be able to get something done and then make it more elegant.
>
>
>
> First, I simplify the problem by separating out the "sentence" column.
>
>
>
> You can do this with your data frame by simply doing this
>
>
>
> MySentence <-data.frame(sentence=yourbigDF$Opis,stringsAsFactors=FALSE)
>
>
>
> so I take your original data.frame (yourbigDF) and I just create a copy of
> that one column
>
>  $Opis
>
>
>
> Later we can merge the two back together after I add 10 columns for the
> words
>
>
>
>
>
> Lets make some dummy data with just 10 rows
>
>
>
>
>
>
>
>  sentence<- "this is a sentence with ten words or maybe more than ten words"
>
>  sentV<-rep(sentence,10)
>
> # now I just made 10 rows of the same sentence
>
> # NEXT, because I am going to create 10 new columns of 10 rows, I create
>
> # 10 vectors; each is named and each has 10 elements, one for each row.
>
> # they have NO DATA in them
>
>
>
>
>  first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)
>
>
>
> # Next I create a data frame with Sentence in the first column and 10 blank
> # columns.
>
> # NOTE I use stringsAsFactors=FALSE
>
>
>
>  DF <-data.frame(Sentence=sentence,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)
>
>
>
> # This is what it would look like ( the first row)
>
> DF[1,]
>
>
>
> Sentence first second third fourth fifth sixth seventh eighth ninth tenth
>
> 1 this is a sentence with ten words or maybe more than ten words FALSE
>  FALSE FALSE  FALSE FALSE FALSE   FALSE  FALSE FALSE FALSE
>
>
>
> Next, I will show you how to assign the first ten words to the 10 blank
> columns
>
>
>
> DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
>
>
>
> #DF[1,2:11]  selects columns 2-11 of the first row
>
> #strsplit  returns the first 10 words [1:10] and places them in
> #columns 2-11
>
>
>
> If you want to do this the slow way you can just loop through your
> dataframe row by row
>
> or you can probably use apply.
>
>
>
> Make more sense?
>
> > DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
>
> > DF[1,]
>
> Sentence first
> second third   fourth fifth sixth seventh eighth ninth tenth
>
> 1 this is a sentence with ten words or maybe more than ten words  this
> is a sentence  with   ten   words or maybe  more
>
> > DF[1,"first"]
>
> [1] "this"
>
>
>
> On Tue, Nov 2, 2010 at 12:22 PM, David Winsemius 
> wrote:
>
>
> On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:
>
> Hi all,
>
> Thanks for all the help. I managed to do it with what Gaj suggested (Excel
> :().
>
> The last solution from David is also great, I just don't understand why R
> put the words in 14 columns and three rows?
>
>
> Because the maximum number of words was 14 and the fill argument was TRUE.
> There were three rows because there were three items in the supplied
> character vector.
>
>
>
> I would like it to put just the first 10 words in the source field into 10
> different destination fields, but in the same row. And so on... is that
> possible?
>
>
>
> I don't know what a destination field might be. Those are not R data types.
>

Re: [R] splitting First 10 words in a string

2010-11-02 Thread steven mosher
Line should be:

first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=nrow(
sent))

Sorry, cut-and-paste error.


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Dennis Murphy
Hi:

I don't know why, but it seems that in

bwplot(voice.part ~ height, data = singer,
main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
'pink' 'violet' 'brown' 'gold'",
fill=c("yellow","blue","green","red","pink","violet","brown","gold"))

the assignment of colors is offset by 3:

Levels: Bass 2 Bass 1 Tenor 2 Tenor 1 Alto 2 Alto 1 Soprano 2 Soprano 1
fillcol <- c("yellow","blue","green","red","pink","violet","brown","gold")

In the above plot,

yellow -> Bass 2  (1)
blue -> Tenor 1 (4)
green -> Soprano 2  (7)
red -> Bass 1 (10 mod 8 = 2)
pink -> Alto 2 (13 mod 8 = 5)
etc.

It's certainly curious.
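
The mapping listed above can be written as a small piece of arithmetic. This is only a sketch of the observed pattern, not an explanation of what lattice does internally; extending the stride of 3 to the last three colours (the "etc.") is an assumption:

```r
fillcol <- c("yellow", "blue", "green", "red", "pink", "violet", "brown", "gold")
lev <- c("Bass 2", "Bass 1", "Tenor 2", "Tenor 1",
         "Alto 2", "Alto 1", "Soprano 2", "Soprano 1")

n <- length(fillcol)

# Colour i lands on box 1, 4, 7, 10, 13, ... wrapped around modulo 8
box <- ((seq_len(n) - 1) * 3) %% n + 1

# Which level each colour ends up on under this hypothesis
mapping <- data.frame(colour = fillcol, box = box, level = lev[box],
                      stringsAsFactors = FALSE)
mapping
```

Because 3 and 8 share no common factor, every box receives exactly one colour under this hypothesis, only in a shuffled order.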

Dennis




Re: [R] Display of NAs in character columns of a data frame under fix() or edit().

2010-11-02 Thread Peter Dalgaard
On 11/02/2010 09:45 PM, Rolf Turner wrote:
> 
> Example:
> 
>   xxx <- data.frame(x=1:26,y=letters)
>   xxx$x[c(2,4,6,8)] <- NA
>   xxx$y[c(1,3,5,7)] <- NA
> 
>   yyy <- edit(xxx)
> 
> The missing values in xxx$y appear as blanks in the spreadsheet window that
> appears, whereas the missing values in the numeric column "x" appear as "NA"
> (as I would expect).
> 
> Is this a bug or a feature?

Probably a feature. How would you enter abbreviations for North America,
Noradrenaline, Neil Adams, etc...? On the other hand, it is currently
impossible to make a field blank.
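
The same ambiguity between the string "NA" and a missing value shows up outside the spreadsheet editor. A small non-interactive sketch, with made-up values and a made-up column name code:

```r
# The literal string "NA", a genuinely missing value, and an empty string
abbrev <- c("NA", NA, "")
is.na(abbrev)

# By default read.csv() treats the text NA as missing ...
df_default <- read.csv(text = "code\nNA\nUS", stringsAsFactors = FALSE)

# ... unless na.strings is changed, in which case "NA" stays a real string
df_keep <- read.csv(text = "code\nNA\nUS", na.strings = "",
                    stringsAsFactors = FALSE)

is.na(df_default$code)
df_keep$code
```

In a display the first two entries of abbrev can look the same unless strings are quoted, which is presumably part of why the editor shows missing character values as blanks.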

Actually, the whole edit() interface is a bit of a long-standing bug.
It's been with us "forever" (as far as I remember, the spreadsheet
interface actually predates data frames in R). It was constructed using
very basic GUI elements on Windows and X11, and it never _quite_ did
what you'd want it to do.

Ideas about how to do better seem to have gotten stuck in indecision
about which graphical toolkit to use. The Rcmdr has a data viewer (but
not editor) written with the Tcl/Tk interface, which might be a starting
point.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
