[R] generate XML

2010-01-07 Thread Robert
Finally, I managed to automatically generate rather simple XML code. In R it
is held in a matrix. The problem concerns how to write that matrix to a file.
I use write.table with the file extension .xml. Everything works perfectly
except the encoding of some Polish letters. The problem disappears when I
change the extension to .txt. I therefore tried writing the file with a .txt
extension and then manually renaming it to .xml, but that does not solve the
problem. Is it possible to write my matrix, or the .txt file, to a valid XML
file? I need to do it automatically.

Best,
Robert  
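
One possible approach, assuming the underlying issue is the file encoding
rather than the extension itself (the thread does not confirm this), is to
write the lines through a connection as UTF-8 bytes and to declare that
encoding in the XML header. The helper name write_xml_utf8 and the sample
content are made up for illustration.

```r
# Hypothetical one-column character matrix holding the generated XML lines.
# The Polish letters are written as \u escapes so the script is locale-proof.
m <- matrix(c(
  "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
  "<names>",
  "  <name>\u0141\u00f3d\u017a</name>",
  "  <name>Krak\u00f3w</name>",
  "</names>"
), ncol = 1)

# Hypothetical helper: write a character vector as UTF-8, whatever the locale.
write_xml_utf8 <- function(lines, path) {
  con <- file(path, open = "wb")   # binary mode: no newline translation
  on.exit(close(con))
  # enc2utf8() converts to UTF-8; useBytes = TRUE stops re-encoding on output
  writeLines(enc2utf8(lines), con, useBytes = TRUE)
  invisible(path)
}

xml_lines <- as.vector(m)
out_path <- tempfile(fileext = ".xml")
write_xml_utf8(xml_lines, out_path)
```

Reading the file back with readLines(out_path, encoding = "UTF-8") is a quick
way to check that the Polish letters survived.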

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Font settings in xfig

2008-05-15 Thread Robert
Hello

I'm using the xfig-function in R to export figures in fig-format. To
use these exported figures in LaTeX, I first run a fig2dev to get a
pstex and pstex_t file. However, in order to get the right pstex_t
file (that is, with the text of the original figure) I have to change
the font and special text variables in the fig files.
I would like R to do this font changing job, but I can't figure out
how by reading the help files.

Does anybody know how to accomplish this?

Best regards

Robert


Re: [R] Font settings in xfig

2008-05-15 Thread Robert
If I edit the fig file with Xfig, I have to change the value of
"special flag" to "special" and the font to a LaTeX font instead of a
postscript font.
If these changes are not made, the pstex_t file produced from fig2dev
doesn't contain the text as it should. So if I input the pstex_t in a
tex file, the text in the figure is not in the same font as the rest
of the document.

I'm exporting figures like this:

xfig(file="test.fig", width=5, height=4, onefile=TRUE)
plot(data,xlab="Something", ylab="Some other thing")
dev.off()

The xfig command does have an argument called encoding, but I can't
figure out if this can be used to make the changes I want. In the par
command there's an argument called family, which is used to specify
the font family in figures, but I don't know how to change this font to
a LaTeX font.

Thanks for your help!

Robert

On 15 May, 17:50, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> On Thu, 15 May 2008, Robert wrote:
> > Hello
>
> > I'm using the xfig-function in R to export figures in fig-format. To
> > use these exported figures in LaTeX, I first run a fig2dev to get a
> > pstex and pstex_t file. However, in order to get the right pstex_t
> > file (that is, with the text of the original figure) I have to change
> > the font and special text variables in the fig files.
> > I would like R to do this font changing job, but I can't figure out
> > how by reading the help files.
>
> > Does anybody know how to accomplish this?
>
> Unlikely without knowing what changes you want made and why 
>
>
>
> > Best regards
>
> > Robert
>
>
> --
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK  Fax:  +44 1865 272595


Re: [R] Font settings in xfig

2008-05-19 Thread Robert
The main goal is to include graphics from R in a LaTeX document.
Thanks a lot for all your help - it's working now.

/Robert
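
The post-processing that scionforbai does with awk in the message quoted
below could also be done from R after dev.off(). This is only a sketch: the
helper name fix_fig_text is made up for illustration, and it assumes, as the
awk one-liner does, that text objects are the lines starting with 4, with the
font in field 6 and font_flags in field 9. Like the awk version it re-joins
the fields with single spaces, so runs of spaces inside a label would be
collapsed.

```r
# Hypothetical helper: mimic  awk '$1==4{$6=0;$9=2}{print}'  in R.
fix_fig_text <- function(lines) {
  is_text <- grepl("^4\\s", lines)          # text objects start with "4 "
  lines[is_text] <- vapply(
    strsplit(lines[is_text], "\\s+"),
    function(fields) {
      fields[6] <- "0"                      # font: default LaTeX font
      fields[9] <- "2"                      # font_flags: "special" text
      paste(fields, collapse = " ")
    },
    character(1)
  )
  lines
}

# Small in-memory example (these lines are made up, not real xfig() output)
fig <- c("2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2",
         "4 0 0 50 -1 4 12 0.0000 4 135 600 1000 2000 Something\\001")
fixed <- fix_fig_text(fig)

# With a real file one might do:
# writeLines(fix_fig_text(readLines("test.fig")), "OUT.fig")
```

Only the text lines are rewritten; every other line of the fig file is passed
through unchanged.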

On 16 May, 17:31, "Greg Snow" <[EMAIL PROTECTED]> wrote:
> If the main goal is to include graphs created by R in LaTeX documents and 
> have them look nice, and using xfig was just one way of attempting this, then 
> here are a couple other options that may or may not work better.
>
> Use the postscript graphics device and use psfrag in LaTeX to replace text in 
> plot with LaTeX commands.  This is fine if you are using postscript and only 
> have a couple of things that need to be replaced.  This can be a pain if you 
> want to replace every tick mark label with the current font in the document, 
> or if you want to go directly to pdf without going through postscript.
>
> Generate pdf/eps files of the graphs using the same font as your LaTeX 
> document (see R-news article (6)2 41-47, on ways to specify the font).  This 
> changes the font to match, but does not do arbitrary LaTeX commands.
>
> Create an eps file, then use eps2pgf 
> (http://sourceforge.net/projects/eps2pgf/) to convert to a pgf file to be 
> included in your LaTeX file (via \input{}).  You need to use the pgf package 
> in your LaTeX file, but then all the graphics are done internally using by 
> default the same fonts as the rest of the document.  You can also do psfrag 
> like replacements when converting the file to include LaTeX commands.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> (801) 408-8111
>
>
>
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Scionforbai
> > Sent: Friday, May 16, 2008 6:57 AM
> > To: Kevin E. Thorpe
> > Cc: [EMAIL PROTECTED]; Robert
> > Subject: Re: [R] Font settings in xfig
>
> > > Is there a reason you are going through this route to get
> > figures into
> > > LaTeX instead of using postscript (or PDF for pdflatex)?
>
> > To have LaTeX-formatted text printed onto your pdf figures,
> > to include in LaTeX documents.
>
> > R cannot output 'special' text in xfig. You need to
> > post-process the .fig file, according to the fig format
> > (http://www.xfig.org/userman/fig-format.html), replacing, on
> > lines starting with '4', the correct values for font and
> > font_flags. Using awk (assuming you work on Linux) this is
> > straightforward :
>
> > awk '$1==4{$6=0;$9=2}{print}' R_FILE.fig > OUT.fig
>
> > Then, in order to obtain a pdf figure with LaTeX-formatted
> > text, you need a simple driver.tex:
>
> > driver.tex :
>
> > \documentclass{article}
> > \usepackage{epsfig}
> > \usepackage{color} %(note: you might or might not need to do this)
> > \begin{document}
> > \pagestyle{empty}
> > \input{FILE.pstex_t}
> > \end{document}
>
> > Now you can go through the compilation:
>
> > fig2dev -L pstex OUT.fig > OUT.pstex
> > fig2dev -L pstex_t -p OUT.pstex OUT.fig > OUT.pstex_t
> > sed s/FILE/"OUT"/ driver.tex > ./OUT.tex
> > latex OUT.tex
> > dvips -E OUT.dvi -o OUT.eps
> > epstopdf OUT.eps
>
> > Of course you need R to write the correct Latex math strings
> > (like $\sigma^2$).
> > Hope this helps,
>
> > scionforbai
>


[R] min of array

2008-11-14 Thread Robert
I wanted to get the minimum of an array of dimension (nr,nc,6), so
the output was of dimension (nr,nc), i.e. for each row,column the
minimum of 6 numbers.  I thought apply might be the sensible way, but
I also compared it to looping over each row/column.  In this case the
loop method was faster than my apply (which probably means I wasn't
using apply in the best way).  Then I found that pmin was much faster. 
Here is some example code:

nr <- 300
nc <- 300

aa <- array(rnorm(nr*nc*6),c(nr,nc,6))

system.time(aamin1 <- apply(aa,c(1,2),min))

system.time({
aamin2 <- matrix(0,nr,nc)
for(i in 1:nr)
  for(j in 1:nc)
  aamin2[i,j] <- min(aa[i,j,])
})

system.time(aamin3 <-
pmin(aa[,,1],aa[,,2],aa[,,3],aa[,,4],aa[,,5],aa[,,6]))

I have three questions:
1) is there a better way to use apply than the approach I have taken
2) How can I write a function to use pmin on an array as above, but
when I don't know the size of the array? ie
3) My applications are from images, so it would be good for me to know
how to go about writing efficient code that does this sort of
operation on arrays, but using more complex functions than pmin/pmax.
Would the sensible way to go be to write the function in C and call it
that way?  Hopefully an answer to (1) will answer this too.

Thanks
I'm using R-2.5.1 (out of date I know, but I have tried this at work
using 2.8.0 as well), on linux.
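
For question (2), one way to call pmin without knowing the size of the third
dimension is to build the list of slices and hand it to do.call. This is a
sketch rather than a tuned solution; the function name pmin_array is made up
for illustration.

```r
# Hypothetical helper: element-wise minimum over the 3rd dimension of an array
pmin_array <- function(a) {
  stopifnot(length(dim(a)) == 3)
  # One nr x nc slice per index of the third dimension
  slices <- lapply(seq_len(dim(a)[3]), function(k) a[, , k])
  out <- do.call(pmin, slices)
  dim(out) <- dim(a)[1:2]   # restore the shape if a slice was dropped to a vector
  out
}

set.seed(1)
small <- array(rnorm(4 * 5 * 6), c(4, 5, 6))
res <- pmin_array(small)
```

The same pattern works with pmax, and the result can be checked against
apply(small, c(1,2), min).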


[R] Best way to study internals of R ( mix of C, C++, Fortran, and R itself)?

2017-11-21 Thread Robert Wilkins
How difficult is it to get a good feel for the internals of R, if you want
to learn the general code base, but also the CPU intensive stuff ( much of
it in C or Fortran?) and the ways in which the general code and the CPU
intensive stuff is connected together?

R has a very large audience, but my understanding is that only a small
group have a good understanding of the internals (and some of those will
eventually move on to something else in their career, or retire
altogether).

While I'm at it, a second question: 15 years ago, nobody would ever offer a
job based on R skills ( SAS, yes, SPSS, maybe, but R skills, year after
year, did not imply job offers). How much has that changed, both for R and
for NumPy/Pandas/SciPy ?

thanks in advance

Robert


[R] Data cleaning & Data preparation, what do R users want?

2017-11-29 Thread Robert Wilkins
R has a very wide audience, clinical research, astronomy, psychology, and
so on and so on.
I would consider data analysis work to be three stages: data preparation,
statistical analysis, and producing the report.
This regards the process of getting the data ready for analysis and
reporting, sometimes called "data cleaning" or "data munging" or "data
wrangling".

So as regards tools for data preparation, speaking to the highly diverse
audience mentioned, here is my question:

What do you want?
Or are you already quite happy with the range of tools that is currently
before you?

[BTW,  I posed the same question last week to the r-devel list, and was
advised that r-help might be a more suitable audience by one of the
moderators.]

Robert Wilkins


Re: [R] Data cleaning & Data preparation, what do R users want?

2017-11-29 Thread Robert Wilkins
Christopher,

OK, well what about a range of functions in an R package that
automatically, with very little syntax, pulls in data from a variety of
formats (CSV, SQLite, and so on) and converts them to an R data frame. You
seem to be pointing to something like that.
Something like that, in some form or another, probably already exists,
though it might be either imperfect (not as user-friendly as possible) or
not well publicised, or both.
Or another tangent: your co-workers are not going to stop using Excel,
whether you like it or not, and many end-users are stuck in the exact same
position as you (co-workers who deliver the data in Excel). I will guess
that data stored in Excel tends to be dirty in somewhat predictable ways.
(And again, those other end-users' co-workers are not going to change their
behaviour.) And so: a data munging tool that makes it as easy as possible
to clean up the data in Excel spreadsheets and export them to R data
frames. One prerequisite: an understanding of what tends to go wrong with
data in Excel (the data in Excel tends to be dirty, but dirty in what
way?).
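
To make the "very little syntax" idea concrete, here is a minimal base-R
sketch of a reader that picks the import function from the file extension.
The name read_any is made up for illustration and is not an existing package
function; SQLite or Excel support would need add-on packages and is left out.

```r
# Hypothetical dispatcher: choose a reader from the file extension
read_any <- function(path, ...) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv = utils::read.csv(path, stringsAsFactors = FALSE, ...),
    tsv = ,                                   # falls through to the txt reader
    txt = utils::read.delim(path, stringsAsFactors = FALSE, ...),
    rds = readRDS(path),
    stop("no reader registered for extension: ", ext)
  )
}

# Usage with a throwaway CSV file
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, x = c("a", "b", "c")), csv_path,
          row.names = FALSE)
d <- read_any(csv_path)
```

The dispatcher only gets the data into a data frame; the cleaning itself would
still be a separate step.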

Thank you for your response Christopher. What state are you in?


On Wed, Nov 29, 2017 at 11:52 AM, Christopher W. Ryan 
wrote:

> Great question. What do I want? I want my co-workers to stop using Excel
> spreadsheets for data entry, storage, and sharing! I want them to
> understand the value of data discipline. But alas . . . .
>
> I work in a county health department in the US. Between dplyr, stringr,
> grep, grepl, and the base R read() functions, I'm doing OK.
>
> I need to learn more about APIs, so I can see if I can make R directly
> grab data from, e.g. our state health department sources. My biggest
> hassle is having to download a data file, save it somewhere, and then
> open R and read it in. I'd like to be able to do it all in R. Would make
> the generation of recurring reports easier.
>
> --Chris Ryan


Re: [R] Data cleaning & Data preparation, what do R users want?

2017-12-11 Thread Robert Wilkins
Dominik (and others)

If it is indeed still the biggest pain point, even in 2017, then maybe we
can do something about that, with more efforts at different user interface
designs and try-outs of them on specialized datasets.
[The fact that in some specialties, such as clinical trials, it is hard to
get access to public domain datasets (so that one has to use a tiny "toy"
dataset, which nobody will pay attention to) does make it harder.]

It would help if academia (both comp-sci and statistics departments) would
support those who invest resources in drafting and test-driving new product
designs. If, in the year 2017, it is still a big pain point, doesn't that
make sense? More speculative work in statistical programming language
design has not been a priority in academia since before 1980.

On Thu, Nov 30, 2017 at 4:11 AM, Dominik Schneider <
dominik.schnei...@colorado.edu> wrote:

> I would agree that getting data into R from various sources is the biggest
> pain point. Even if there is an api, the results are not always consistent
> and you have to do lots of dimension checking to get it right. Or there
> isn't an open api at all and you have to hack it by web scraping or
> otherwise- http://enpiar.com/2017/08/11/one-hour-package/
>
> On Thu, Nov 30, 2017 at 1:00 AM, Jim Lemon  wrote:
>
>> Hi again,
>> Typo in the last email. Should read "about 40 standard deviations".
>>
>> Jim
>>
>> On Thu, Nov 30, 2017 at 10:54 AM, Jim Lemon  wrote:
>> > Hi Robert,
>> > People want different levels of automation in the software they use.
>> > What concerns many of us is the desire for the function
>> > "figure-out-what-this-data-is-import-it-and-get-rid-of-bad-values".
>> > Such users typically want something that justifies its use by being
>> > written by someone who seems to know what they're doing and lots of
>> > other people use it. One advantage of many R functions is their
>> > modular construction. This encourages users to at least consider the
>> > steps that are taken rather than just accept what comes out of that
>> > long tube.
>> >
>> > Take the contentious problem of outlier identification. If I just let
>> > the black box peel off some values, I don't know what I have lost. On
>> > the other hand, if I import data and examine it with a summary
>> > function, I may find that one woman has a height of 5.2 meters. I can
>> > range check by looking up the Guinness Book of Records. It's an
>> > outlier. I can estimate the probability of such a height.  Hmm, about
>> > 4 standard deviations above the mean. It's an outlier. I can attempt a
>> > Sherlock Holmes. "Watson, I conclude that an imperial measure (5'2")
>> > has been recorded as a metric value". It's not an outlier.
>> >
>> > The more R gravitates toward "black box" functions, the more some
>> > users are encouraged to let them do the work. You pays your money and
>> > you takes your chances.
>> >
>> > Jim


Re: [R] Facing problem in installing the package named "methyAnalysis"

2017-12-29 Thread Robert Baer

Bioconductor help is here:

https://www.bioconductor.org/help/



On 12/29/2017 6:00 AM, Pijush Das wrote:

Thank you Michael Dewey.
Can you please send me the email id for Bioconductor.




regards
Pijush

On Fri, Dec 29, 2017 at 5:20 PM, Michael Dewey 
wrote:


Dear Pijush

You might do better to ask on the Bioconductor list as IRanges does not
seem to be on CRAN so I deduce it is a Bioconductor package too.

Michael


On 29/12/2017 07:29, Pijush Das wrote:


Dear Sir,




I have been using R for a long time. But recently I have faced a problem
when installing the Bioconductor package named "methyAnalysis". Firstly it
was require to update my older R (R version 3.4.3 (2017-11-30)) in to
newer
version. That time I have also updated the RStudio software.

After that when I have tried to install the package named "methyAnalysis".
It shows some error given below.

No methods found in package ‘IRanges’ for requests: ‘%in%’,
‘elementLengths’, ‘elementMetadata’, ‘ifelse’, ‘queryHits’, ‘Rle’,
‘subjectHits’, ‘t’ when loading ‘bumphunter’
Error: package or namespace load failed for ‘methyAnalysis’:
   objects ‘.__T__split:base’, ‘split’ are not exported by
'namespace:IRanges'
In addition: Warning message:
replacing previous import ‘BiocGenerics::image’ by ‘graphics::image’ when
loading ‘methylumi’

I also try to install the package after downloading the source package
from
Bioconductor but the method is useless.

Please help me to install the package named "methyAnalysis".

Thanking you



regards
Pijush

--
Michael
http://www.dewey.myzen.co.uk/home.html


[R] Clinical Trial data sets in public domain?

2018-01-13 Thread Robert Wilkins
Is anybody using R to do analysis of clinical trial datasets that have been
put in the public domain (which are super hard to find). Not only a single
data table, but the actual database, with a handful of data tables with
one-to-one or many-to-one relationships?

[ For example, "Adverse Events" and "Patient Info" are two datasets with a
many-to-one relationship, the "Patient Info" dataset has precisely one row
for each patient who received a dose of study drug.]

Robert Wilkins


Re: [R] Fortune candidate

2018-01-29 Thread Robert Baer

On 1/27/2018 12:16 PM, David Winsemius wrote:

John (to a serial querulant):

 ...but with such a sweeping lack of
information from you, don't congratulate yourself if you get a helpful
answer.  It wasn't your fault.


David Winsemius
Alameda, CA, USA

Second that nomination!



'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law

--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX


[R] update.packages() error R 3.4.0

2017-04-28 Thread Robert Baer
Is there an easy work-around for the update.packages error I'm getting 
on Windows 10 with R 3.4.0?


> update.packages()
--- Please select a CRAN mirror for use in this session ---
foreign :
 Version 0.8-67 installed in C:/Program Files/R/R-3.4.0/library
 Version 0.8-68 available at https://mirror.las.iastate.edu/CRAN
Update (y/N/c)?  y
Warning in install.packages(update[instlib == l, "Package"], l, repos = 
repos,  :

  'lib = "C:/Program Files/R/R-3.4.0/library"' is not writable
Error in if (file.exists(dest) && file.mtime(dest) > file.mtime(lib) &&  :
  missing value where TRUE/FALSE needed
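
A possible work-around, offered as a sketch and not as a confirmed fix for the
error above, is to install updates into a personal library that is writable
instead of C:/Program Files. The helper name use_personal_library is made up
for illustration; running R as administrator would be the other obvious thing
to try.

```r
# Hypothetical helper: create a personal library and put it first on the path
use_personal_library <- function(lib = Sys.getenv("R_LIBS_USER")) {
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib, .libPaths()))
  invisible(lib)
}

# Demonstration with a temporary directory standing in for the personal library
demo_lib <- file.path(tempdir(), "personal-lib")
use_personal_library(demo_lib)

# In a real session one might then run (not executed here, needs a CRAN mirror):
# use_personal_library()
# update.packages(instlib = .libPaths()[1], ask = FALSE)
```

Packages updated this way end up in the personal library and mask the older
copies in the system library.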

--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX


Re: [R] display double dot over character in plotmath?

2017-05-14 Thread Robert Baer
I got this, but the spacing is all wrong and plotmath seems to have no 
way to do kerning or overprinting.  I'm surprised Paul didn't 
generalize the hat()-type functionality.


ggplot(data, aes(x=X)) + geom_line(aes(y = Z), size=0.43) +
  xlab(expression(atop("\U0308",Omega)))

ggplot(data, aes(x=X)) + geom_line(aes(y = Z), size=0.43) +
  xlab(expression(atop("\U0308",omega)))
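
Another thing that might be worth trying, as an untested assumption rather
than a known fix, is to skip plotmath and pass a plain Unicode string in which
the combining diaeresis (U+0308) directly follows the letter. Whether the two
code points are drawn as one accented glyph depends on the graphics device and
font, so this may or may not render well on a given system.

```r
# Greek letter followed by the combining diaeresis U+0308
omega_ddot <- "\u03c9\u0308"   # small omega with two dots
Omega_ddot <- "\u03a9\u0308"   # capital omega with two dots

# Possible use in a plot (commented out so the snippet has no side effects):
# plot(1:10, xlab = omega_ddot)
```

With ggplot2 the same strings could be passed to xlab() in place of the
expression(atop(...)) calls above.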


On 5/14/2017 11:18 AM, Ranjan Maitra wrote:

On Sun, 14 May 2017 09:08:46 -0700 David Winsemius  
wrote:


On May 14, 2017, at 8:43 AM, Ranjan Maitra  wrote:

Thanks, Duncan!

This works for the particular case and is, to my mind, a great solution!

However, I was wondering: is it possible to use these double dots with another 
character, such as omega?

I apologize for changing the question somewhat, but I did not realize earlier 
that there were separate codes for putting double dots over different letters 
and I thought that figuring out the simpler question would be enough for me to 
figure out the next step.

I think you should be looking for a LaTeX solution. There is a 
tikzDevice-package.

This says you can assemble symbols with backspaces:

https://www.stat.berkeley.edu/~partha/symbols.pdf

For instance, LaTeX defines \hbar (“ħ”) as a “¯” character (\mathchar’26) 
followed by a backspace of 9 math units (\mkern-9mu), followed by the letter 
“h”:

The second example in ?tikz, which could be a starting point for completing 
your task fails on my Mac by only displaying the names of the glyphs but not 
the glyphs themselves in the plot,  but it might have a better chance of 
succeeding on a Linux box.


Thanks! I was trying to avoid using tikz but I guess that there may well be no 
other alternative.

Best wishes,
Ranjan


Re: [R] Fwd: Cannot generate a *.docx file

2017-05-14 Thread Robert Baer
I don't know what the error is, but your code snippet worked fine for me 
on Windows 10, R 3.4.0-patched.


I noticed that rJava is a dependency.   I don't know whether the patch or 
the Java updates I installed today make a difference, but you might 
update your packages, the patched R version, Java, etc. and try again [since 
it worked here pasted right from your example].



On 5/14/2017 2:30 PM, Yves S. Garret wrote:

I'm using R 3.4.0.

-- Forwarded message --
From: Yves S. Garret 
Date: Sun, May 14, 2017 at 2:35 PM
Subject: Cannot generate a *.docx file
To: r-help 


Hello,

I have the following code example:

library(ReporteRs)

# Create a word document to contain R outputs
doc <- docx()

# Add a title to the document
doc <- addTitle(doc, "Simple Word document", level = 1)

# Add a paragraph of text into the Word document
cat("Output 1\n")
doc <- addParagraph(doc, "This.")
cat("Output 2\n")

# Write the Word document to a file
writeDoc(doc, file = "r-reporters-simple-word-document.docx")

When I run it, this is what I see:

source("writing_to_ms_word_new.R")
Output 1
Error in UseMethod("addParagraph") :
   no applicable method for 'addParagraph' applied to an object of class
"docx"

Why?  The library loads as it should.  So why am I getting the above error?

Thanks in advance.


Re: [R] [R-SIG-Finance] getting a subset corresponding to a list element

2017-05-26 Thread Robert Harlow
Hi Michael,
  Try not to post twice - this is really more of a general R question.  To
answer the question, however, turn each element of your resultlist into an
xts (or zoo)  object so that you have a list of xts objects (called xtsList
for example.)  Then call do.call("merge", xtsList).  Also, your example is
tough because it requires access to bloomberg, which isn't necessarily the
case for the vast majority of R users.
Bob
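
A small sketch of that suggestion, using made-up data in place of the
Bloomberg result since bdh() needs a terminal. It assumes the xts package is
installed and that each list element is a data frame with a date column and a
PX_LAST column; both column names are assumptions about what bdh() returns.

```r
library(xts)  # assumes the xts package is installed

# Hypothetical stand-in for the list returned by bdh()
dates <- as.Date("2017-01-31") + 0:2
resultlist <- list(
  "SPX Index"  = data.frame(date = dates, PX_LAST = c(1, 2, 3)),
  "MXWO Index" = data.frame(date = dates, PX_LAST = c(4, 5, 6))
)

# Turn each element into an xts object indexed by date
xtsList <- lapply(resultlist, function(d) xts(d$PX_LAST, order.by = d$date))

# Merge all of them on the date index, then label the columns by symbol
merged <- do.call("merge", unname(xtsList))
colnames(merged) <- names(resultlist)
```

The merge is done on the date index, so symbols with different date coverage
end up with NA where they have no observation.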

On Fri, May 26, 2017 at 4:58 PM, Michael Ashton <
m.ash...@enduringinvestments.com> wrote:

> I'm not sure how to ask this with the proper terminology, but here goes:
>
> The BDH() function in RBLPAPI returns, for a list of symbols (e.g., 'SPX
> Index','RIY Index','IBM Equity') a list of closing prices. The problem is
> that the result is not a matrix or a dataframe, but a list.
>
> So, if I run the query with 3 symbols, I get a list with 3 elements. For
> example, in this case, if
>
> symbollist <- c("SPX Index","MXWO Index","MXEA Index")
> resultlist <- bdh(symbollist, "PX_LAST", options=opt, start.date=as.Date(begdate))
>
> then resultlist is a list with 3 elements, and as many rows as there are
> dates between "begdate" and today (or as many month-ends, if "opt" declares
> monthly periodicity). Suppose in this case I've set this up to retrieve 60
> dates.
>
> But I don't WANT a list. I want a zoo object containing each of these as
> an element. I thought about starting by trying to put each element in a
> matrix by
>
> data<-matrix(nrow=60,ncol=length(symbollist))
>
> and then looping through from 1 to length(symbollist), letting
>
> data[,i] <- resultlist$symbollist[i][,2]
>
> but this clearly doesn't work since what I really want is
>
> data[,1] <-resultlist$'SPX Index'[,2]
> data[,2] <-resultlist$'MXWO Index'[,2]
> etc
>
> But there's probably a much easier way to do this.
>
> I am sending this to both the general help list and the r-sig-finance list
> since there is probably both a general way to stuff a list into a zoo
> object and a way to do it cleanly with the BDH() command. Thanks in advance
> for help.
>
> Mike
>
>


Re: [R] finding components of an API

2017-05-28 Thread Robert Sherry

Erin,

I do not think there is an R package that will enable you to get the 
data you would like from spotcrime.com.


You could write code, in R or some other language, to extract the data 
you want, but that is going to be a challenging task, and if
the website changes its format then your code may suddenly stop working. 
Also, the people who run spotcrime.com may not be happy if you do so.


Bob

On 5/28/2017 4:45 PM, Erin Hodgess wrote:

Sorry!!!

It's spotcrime.com, and I would like to use it via R for crimes.



On Sun, May 28, 2017 at 2:19 PM, Jeff Newmiller 
wrote:


Can you please be just a little less vague? What API are you talking
about, and how is this related to R?
--
Sent from my phone. Please excuse my brevity.

On May 28, 2017 11:48:42 AM PDT, Erin Hodgess 
wrote:

Hello!

I would like to use a particular API for crimes (spot crimes) but I can't
find what components go into the API.  I have gone into the website, but to
no avail.

Has anyone used it please?

Thank you,
Sincerely,
Erin







Re: [R] Need help for Netbeans R plugin development

2017-05-29 Thread Robert Baer

rJava and Rserve might be architectures of interest.


On 5/28/2017 1:12 PM, Peter Cheung wrote:

Hi
My name is Peter. I am developing an R plugin for NetBeans, written entirely 
in Java. What is the best way for Java to interact with R, and how can I hook 
R functions such as plot(), so that every time plot() is called I can capture 
the generated graph?
thanks
from Peter


Re: [R] creat contingency tables with fixed row and column margins

2017-05-29 Thread Robert Baer

getAnywhere(fisher.test) probably has some clues
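
For small margins, a brute-force enumeration is also possible in base R. The sketch below is only an illustration and is not taken from fisher.test; `enumerate_tables` is a hypothetical helper name. It fills the table row by row, trying every way to split a row total across the columns without exceeding the remaining column totals. The last row is then forced.

```r
enumerate_tables <- function(row_sums, col_sums) {
  stopifnot(sum(row_sums) == sum(col_sums))

  # All vectors of non-negative integers that sum to `total`,
  # with element j not exceeding caps[j].
  compositions <- function(total, caps) {
    if (length(caps) == 1L) {
      return(if (total <= caps) list(total) else list())
    }
    out <- list()
    for (v in 0:min(total, caps[1])) {
      for (rest in compositions(total - v, caps[-1])) {
        out[[length(out) + 1L]] <- c(v, rest)
      }
    }
    out
  }

  # Fill the table one row at a time; the final row must equal
  # whatever is left of the column totals.
  fill <- function(rows_left, cols_left) {
    if (length(rows_left) == 1L) {
      return(list(matrix(cols_left, nrow = 1)))
    }
    out <- list()
    for (first in compositions(rows_left[1], cols_left)) {
      for (rest in fill(rows_left[-1], cols_left - first)) {
        out[[length(out) + 1L]] <- rbind(first, rest, deparse.level = 0)
      }
    }
    out
  }

  fill(row_sums, col_sums)
}

# The example from the question: 3 by 3 tables with the given margins.
tabs <- enumerate_tables(c(3, 6, 6), c(5, 5, 5))
length(tabs)   # number of tables found
tabs[[1]]      # first table in the enumeration
```

The number of tables grows very quickly with the margins, so this is only practical for small problems.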


On 5/27/2017 2:49 PM, li li wrote:

Hi all,
   Is there an R function that can be used to enumerate all the contingency
tables with fixed row and column margins. For example, can we list all 3 by
3 tables with row margins 3,6,6 and column margins 5,5,5.
Thanks very much!
Hanna



[R] Estimating Unbiased Standard Deviation with Autocorrelation

2017-06-15 Thread Robert McGehee
Hello,
I have a vector of values with significant autocorrelation, and I want to 
calculate an unbiased standard deviation that adjusts for the autocorrelation. 
The formula linked below purports to provide what I want:

https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation#Effect_of_autocorrelation_.28serial_correlation.29

However, rather than just implementing this equation in my own function, I 
figured there is likely already an R function that does this, and perhaps does 
a better job of handling the subtleties of the adjustment when the ACF itself 
is estimated from the same data that is used to estimate the sample standard 
deviation (if there are any).
 
If such a function exists, can anyone point me to it?

Thanks in advance,
Robert
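
For reference, the correction on the linked page can be sketched in a few lines. This is a hedged illustration, not an existing R function: `sd_autocorr` is a hypothetical name, and the autocorrelations `rho` (lags 1, 2, ...) are assumed to be known or supplied from elsewhere. The sketch divides the sample variance by gamma = 1 - 2/(n-1) * sum((1 - k/n) * rho_k). It does not address the subtleties of estimating the ACF from the same data, and the square root of an unbiased variance estimate is still only approximately unbiased for the standard deviation.

```r
sd_autocorr <- function(x, rho) {
  n <- length(x)
  k <- seq_along(rho)          # lags 1..K, with K < n
  stopifnot(length(rho) < n)

  # Factor by which the usual sample variance is biased under autocorrelation.
  gamma <- 1 - 2 / (n - 1) * sum((1 - k / n) * rho)
  stopifnot(gamma > 0)

  sqrt(var(x) / gamma)
}

# Made-up data and an assumed AR(1)-like correlation structure (rho_k = 0.6^k).
x   <- c(1.2, 1.9, 2.4, 2.1, 1.5, 0.9, 0.7, 1.3)
rho <- 0.6^(1:3)

sd(x)                  # naive estimate
sd_autocorr(x, rho)    # adjusted estimate (larger for positive autocorrelation)
```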



Re: [R] MODISTools Help

2017-06-23 Thread Robert Baer

--


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX


[R] R memory limits on table(x, y) (and bigtabulate)

2017-07-03 Thread Robert Zimbardo
I have two character vectors x and y that have the following characteristics:

length(x)  # same as
length(y) # 872099

length(unique(x))  # 47740
length(unique(y)) # 52478

I need to crosstabulate them, which would lead to a table with

47740*52478 # 2505299720

cells, which is more than

2^31 # 2147483648

cells, which seems to be R's limit because I am getting the error message

Error in table(x, y) : attempt to make a table with >= 2^31 elements

Two questions:

- is this really R's limit, even on a 64bit machine? It seems like it
(given 

and , but
I just want to make sure I understood that right);
- I thought I could handle this with the package bigtabulate, but whenever I run

xy.tab <- bigtable(data.frame(x, y), ccols=1:2)

R crashes as follows:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

Any idea on what I am doing wrong with bigtabulate? Thanks for your
consideration
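
One possible workaround, offered as a sketch rather than a tested answer for data of this size, is a sparse cross-tabulation. xtabs() accepts sparse = TRUE (which relies on the Matrix package) and then stores only the combinations that actually occur, which here cannot exceed length(x). The small vectors below are made up for illustration.

```r
library(Matrix)

# Toy stand-ins for the two long character vectors.
x <- c("a", "b", "a", "c", "a")
y <- c("u", "v", "u", "w", "v")

# Sparse contingency table: only observed (x, y) pairs take up memory.
xy.tab <- xtabs(~ x + y, sparse = TRUE)

dim(xy.tab)
xy.tab["a", "u"]   # count for one particular pair
```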



[R] unable to collate and parse R files for package ‘colorspace’

2017-07-05 Thread Kabacoff, Robert
When attempting to install the ‘colorspace’ package on RedHat Linux I get the 
following error. Any help would be appreciated.

Rob

Rob Kabacoff, Ph.D.
Professor, Quantitative Analysis Center
Wesleyan University


> install.packages("colorspace")
Installing package into ‘/home/rkabacoff/R/x86_64-redhat-linux-gnu-library/3.3’
(as ‘lib’ is unspecified)

trying URL 'https://cran.mtu.edu/src/contrib/colorspace_1.3-2.tar.gz'
Content type 'application/x-gzip' length 293433 bytes (286 KB)
==
downloaded 286 KB

* installing *source* package ‘colorspace’ ...
** package ‘colorspace’ successfully unpacked and MD5 sums checked
** libs
gcc -m64 -std=gnu99 -I/usr/include/R -DNDEBUG  -I/usr/local/include -fpic  
-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m64 -mtune=generic -fpic -fPIC   -c colorspace.c -o 
colorspace.o
colorspace.c:589: warning: ‘CheckGamma’ defined but not used
gcc -m64 -std=gnu99 -shared -L/usr/lib64/R/lib -o colorspace.so colorspace.o 
-L/usr/lib64/R/lib -lR
installing to /home/rkabacoff/R/x86_64-redhat-linux-gnu-library/3.3/colorspace/libs
** R
Error in parse(outFile) :
  /home/rkabacoff/R/x86_64-redhat-linux-gnu-library/3.3/colorspace/R/colorspace:2:1: unexpected $end
ERROR: unable to collate and parse R files for package ‘colorspace’
* removing ‘/home/rkabacoff/R/x86_64-redhat-linux-gnu-library/3.3/colorspace’


> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Oracle Linux Server 6.8

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.3.2


Re: [R] How can I make the legend in ggplot2 the same height as my plot?

2017-07-06 Thread Robert Baer

Don't know what your data looks like, but you might try:

p <- Scenario1 + guides(fill = guide_colorbar(barwidth = 1.5,
                                              barheight = unit(10, "mm")))


print(p)
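
One thing that may matter here: the plot in this thread maps value to colour, not fill, so a guide attached to fill would likely have no visible effect. The sketch below uses made-up data in place of the poster's `dat` and attaches the guide to colour instead. The barheight value is an arbitrary assumption; it is an absolute size, so matching the panel height exactly would still need tuning by hand.

```r
library(ggplot2)

# Made-up data standing in for the poster's `dat` (columns X, Y, value, variable).
set.seed(1)
dat <- data.frame(
  X = runif(40),
  Y = runif(40),
  value = rnorm(40),
  variable = rep(c("A", "B"), each = 20)
)

p <- ggplot(dat, aes(X, Y)) +
  geom_point(aes(colour = value), size = 1) +
  facet_wrap(~variable, ncol = 2) +
  scale_color_gradientn(colours = rainbow(48)) +
  theme_bw() +
  # Target the colour guide, because the points map `value` to colour.
  guides(colour = guide_colorbar(barwidth = 1.5, barheight = unit(60, "mm")))
```

Calling print(p) then draws the plot with the taller colour bar.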


On 7/5/2017 5:13 PM, Kristi Glover wrote:

Hi R Users,

I tried to increase the legend height in ggplot2, but it did not respond at all 
using the following code. Do you have any suggestions for me?

dat<-data.frame(temperature)

P1<-ggplot(dat, aes(X, Y))

Scenario1<-P1+geom_point(aes(colour = value), size = 1)+ theme_bw()+ 
theme(axis.text.x = element_blank(),axis.text.y = element_blank())

Scenario1<-Scenario1+facet_wrap(~variable, 
ncol=2)+scale_color_gradientn(colours = rainbow(48))

Scenario1+guides(fill = guide_colorbar(barwidth = 1.5, barheight = unit(10, 
"mm")))


Thanks,

KG




Re: [R] Extracting sentences with combinations of target words/terms from cancer patient text medical records

2017-07-12 Thread Robert McGehee
Hi Paul,
Sounds like you have your answer, but for fun I thought I'd try solving your 
problem using only a regular expression query and base R. I believe this works:
 
> txt <- paste("Patient had stage IV breast cancer.",
+              "Nothing matches this sentence.",
+              "Metastatic and breast match this sentence.",
+              "French bike champion takes stage IV victory in Tour de France.")

> pattern <- paste0("([^.?!]*(?=[^.?!]*\\bbreast\\b)",
+                   "(?=[^.?!]*\\b(metastatic|stage IV)\\b)",
+                   "(?=[\\s.?!])[^.?!]*[.?!])")

> regmatches(txt, gregexpr(pattern, txt, perl=TRUE, ignore.case=TRUE))[[1]]
[1] "Patient had stage IV breast cancer."
[2] " Metastatic and breast match this sentence."

Cheers, Robert

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Paul Miller via 
R-help
Sent: Wednesday, July 12, 2017 8:49 AM
To: Bert Gunter 
Cc: R-help 
Subject: Re: [R] Extracting sentences with combinations of target words/terms 
from cancer patient text medical records

Hi Bert,

Thanks for your reply. It appears that I didn't replace the variable name 
"sampletxt" with the argument "x" in my function. I've corrected that and now 
my code seems to be working fine.

Paul


From: Bert Gunter 

Cc: R-help 
Sent: Tuesday, July 11, 2017 2:00 PM
Subject: Re: [R] Extracting sentences with combinations of target words/terms 
from cancer patient text medical records



Have you looked at the CRAN Natural Language Processing Task View? If not, why 
not? If so, why were the resources described there inadequate?

Bert


On Jul 11, 2017 10:49 AM, "Paul Miller via R-help"  wrote:

Hello All,
>
>I need some help figuring out how to extract combinations of target 
>words/terms from cancer patient text medical records. I've provided some 
>sample data and code below to illustrate what I'm trying to do. At the moment, 
>I'm trying to extract sentences that contain the word "breast" plus either 
>"metastatic" or "stage IV".
>
>It's been some time since I used R and I feel a bit rusty. I wrote a function 
>called "sentence_match" that seemed to work well when applied to a single 
>piece of text. You can see that by running the section titled
>
>"Working code". I thought that it might be possible easily to apply my 
>function to a data set (tibble or df) but that doesn't seem to be the case. My 
>unsuccessful attempt to do this appears in the section titled "Non-working 
>code".
>
>If someone could help me get my code up and running, that would be greatly 
>appreciated. I'm using a lot of functions from Hadley Wickham's packages, but 
>that's not particularly necessary. Although I have only a few entries in my 
>sample data, my actual data are pretty large. Currently, I'm working with over 
>a million records. Some records contain only a single sentence, but many have 
>several paragraphs. One concern I had was that, even if I could get my code 
>working, it would be too inefficient to handle that volume of data.
>
>Thanks,
>
>Paul
>
>
>library(tidyverse)
>library(stringr)
>library(lubridate)
>
>sentence_match <- function(x){
>  sentence_extract <- str_extract_all(sampletxt, boundary("sentence"), 
> simplify = TRUE)
>  sentence_number <- intersect(str_which(sentence_extract, "breast"), 
> str_which(sentence_extract, "metastatic|stage IV"))
>  sentence_match <- str_c(sentence_number, ": ", 
> sentence_extract[sentence_number], collapse = "")
>  sentence_match
>}
>
> Working code 
>
>sampletxt <- "This sentence contains the word metastatic and the word breast. 
>This sentence contains no target words."
>
>sentence_match(sampletxt)
>
> Non-working code 
>
>sampletxt <-
>  structure(
>list(
>  PTNO = c(1, 2, 2, 2),
>  DATE = structure(c(16436, 16436, 16832, 16845), class = "Date"),
>  TYPE = c("Progress note", "CAT scan", "Progress note", "Progress note"),
>  TVAR = c(
>"This sentence contains the word metastatic. This sentence contains 
> the term stage IV.",
>"This sentence contains no target words. This sentence also contains 
> no target words.",
>"This sentence contains the word metastatic and the word breast. This 
> sentence contains no target words.",
>"This sentence contains the words breast and the term metastatic. This
>sentence contains the word breast and the term stage IV."
>  )
>),
>.Names 

Re: [R] Extracting sentences with combinations of target words/terms from cancer patient text medical records

2017-07-13 Thread Robert McGehee
Hi Paul,
No need to collapse the information into a single text string, gregexpr() can 
take a vector of strings (sentences in your case). You can split your sentences 
up, number them how you want, then search for your pattern either via regex or 
via these extra packages you use which probably use the PCRE regex library 
anyway. However, as this is basically what you did, I'm not sure why you're not 
happy with your existing approach.
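
The split-then-search idea described above can be sketched in base R as follows. This is only an illustration: `match_sentences` is a hypothetical helper, the sentence splitter is a simple regex (split after ., ? or ! followed by whitespace) rather than a proper sentence tokenizer, and records with no match return NA instead of a stray ": ".

```r
match_sentences <- function(record) {
  # Split after sentence-ending punctuation that is followed by whitespace.
  sentences <- unlist(strsplit(record, "(?<=[.?!])\\s+", perl = TRUE))

  # A sentence qualifies if it mentions "breast" plus either target term.
  hit <- grepl("\\bbreast\\b", sentences, ignore.case = TRUE, perl = TRUE) &
    grepl("\\b(metastatic|stage IV)\\b", sentences, ignore.case = TRUE, perl = TRUE)

  if (!any(hit)) return(NA_character_)

  # Prefix each matching sentence with its position within the record.
  paste0(which(hit), ": ", sentences[hit], collapse = " ")
}

records <- c(
  "This sentence contains the word metastatic. This sentence contains the term stage IV.",
  "This sentence contains no target words. This sentence also contains no target words.",
  "This sentence contains the word metastatic and the word breast. This sentence contains no target words.",
  "This sentence contains the words breast and the term metastatic. This sentence contains the word breast and the term stage IV."
)

extracted <- vapply(records, match_sentences, character(1), USE.NAMES = FALSE)
extracted
```

Each record is split once and searched with two vectorised grepl() calls, and the result can be assigned directly as a new column of a data frame.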



-Original Message-
From: Paul Miller [mailto:pjmiller...@yahoo.com] 
Sent: Thursday, July 13, 2017 3:01 PM
To: Robert McGehee 
Cc: r-help@r-project.org
Subject: Re: [R] Extracting sentences with combinations of target words/terms 
from cancer patient text medical records

Hi Robert,

Thank you for your reply. An attempt to solve this via a regular expression 
query is particularly helpful. Unfortunately, I don't have much time to play 
around with this just now. Ultimately though, I think I would like to implement 
a solution something along the lines of what you have done. I have a book on 
regular expressions that I am now starting to read. In the meantime, the code 
I'm using is a good way to assess the feasibility of some ideas I'd like to 
implement.

The advantage of your approach I think is that it makes fewer passes through 
the data. That should make it a lot faster and more efficient than what I've 
done. I'm currently working with a little more than 2.5 million text records 
and I think that number will only rise. So efficiency really should matter. 

I've pasted the latest version of my sample code below. This shows how I'd like 
to add the result of the text search as a column in a data frame. It also shows 
how I'd like to append the sentence number to each identified sentence. The 
single colon that appears where there is no match is not by design. It's 
something that I need to tidy.

My sense is that if I used your regular expression as written, I'd lose the 
information about the sentence number when I added the result as a column in my 
data frame. Presumably, I'd need to collapse the information into a single text 
string, and then the numbering would be lost. If you were going to get the 
sentence numbers as well, without making several passes through the data like 
my code does, how would you go about it?

Thanks,

Paul


library(tidyverse)
library(stringr)
library(lubridate)
 
sentence_match <- function(x){
  sentence_extract <- str_extract_all(x, boundary("sentence"), simplify = TRUE)
  sentence_number <- intersect(str_which(sentence_extract, "breast"), 
str_which(sentence_extract, "metastatic|stage IV"))
  sentence_match <- str_c(sentence_number, ": ", 
sentence_extract[sentence_number], collapse = "")
  sentence_match
}
 
sampletxt <-
  structure(
    list(
      PTNO = c(1, 2, 2, 2),
      DATE = structure(c(16436, 16436, 16832, 16845), class = "Date"),
      TYPE = c("Progress note", "CAT scan", "Progress note", "Progress note"),
      TVAR = c(
        "This sentence contains the word metastatic. This sentence contains the term stage IV.",
        "This sentence contains no target words. This sentence also contains no target words.",
        "This sentence contains the word metastatic and the word breast. This sentence contains no target words.",
        "This sentence contains the words breast and the term metastatic. This sentence contains the word breast and the term stage IV."
      )
    ),
    .Names = c("PTNO", "DATE", "TYPE", "TVAR"),
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA, -4L)
  )
 
sampletxt$EXTRACTED <- sapply(sampletxt$TVAR, sentence_match)
sampletxt$EXTRACTED
 
> sampletxt$EXTRACTED
[1] ": "
[2] ": "
[3] "1: This sentence contains the word metastatic and the word breast. "
[4] "1: This sentence contains the words breast and the term metastatic. 2: This sentence contains the word breast and the term stage IV."


From: Robert McGehee 
To: Paul Miller ; Bert Gunter  
Cc: "r-help@r-project.org" 
Sent: Wednesday, July 12, 2017 12:47 PM
Subject: RE: [R] Extracting sentences with combinations of target words/terms 
from cancer patient text medical records



Hi Paul,
Sounds like you have your answer, but for fun I thought I'd try solving your 
problem using 

Re: [R] PROC MIXED RANDOM equivalence in R nlme

2017-08-11 Thread Robert Baer



On 8/10/2017 8:34 AM, Dennis F. Kahlbaum wrote:

-- snip --
I don't have real help, but I'll remind you that R is case sensitive, 
and it looks like that will be at least one problem in the solution you 
are working on below:

lme not LME
data not DATA
random not RANDOM
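
As a small illustration of the lowercase spelling, here is a sketch using the Orthodont data that ships with nlme. It stands in for the poster's emiss data and does not attempt to reproduce the SAS RANDOM statement; the random-intercept formula is only an assumed example.

```r
library(nlme)

# Lowercase function and argument names; a random intercept per Subject.
fit <- lme(distance ~ age,
           data = Orthodont,
           random = ~ 1 | Subject)

fixef(fit)   # fixed-effect estimates
```

The same call with LME, DATA or RANDOM fails, because R treats these as different names.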

--

The R code I've devised for the PROC MIXED statement is shown below:

--
FitTHC <- LME(ln_thc ~ rv + t5 + t9 + ar + ol + ox + su + bz,
  DATA = emiss,
  RANDOM = ??? )
--

As indicated, the problem I'm having is in constructing the equivalent 
code for the RANDOM and any remaining settings. I've tried


RANDOM = ~1 + rv + t5 + t9 + ar + ol + ox + su + bz | new)

but R hangs 
Are the items in your random formula columns in a dataframe named 
emiss?  Do they have data types?  Even if the data are proprietary some 
fake data can make the problem more concrete.
You are saying "gets caught in a processing loop that produces no errors 
or warnings"???


and never produces a result. Therefore, what is the equivalent code 
for the SAS RANDOM?


Thanks!



--


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX



Re: [R] Converting SAS Code

2017-09-30 Thread Robert Baer


On 9/29/2017 3:37 PM, Rolf Turner wrote:
> On 30/09/17 07:45, jlu...@ria.buffalo.edu wrote:
>
> 
>
>>
>> The conceptual paradigm for R is only marginally commensurate with 
>> that of
>> standard statistical software.
>> You must immerse yourself in R to become proficient.
>
> Fortune nomination.
For newer list members wondering what Rolf is talking about, try:

library(fortunes)
fortune()

to get a flavor! There are many pearls of wisdom.


>
> cheers,
>
> Rolf
>




Re: [R] x-axis tick marks on log scale plot

2016-05-20 Thread Robert Baer
Very, very nice.  Thanks for sharing.


On 5/20/2016 4:21 AM, Martin Maechler wrote:
>> Brian Smith 
>>  on Thu, 19 May 2016 11:04:55 -0400 writes:
>  > Thanks all !!  On Thu, May 19, 2016 at 9:55 AM, Ivan
>  > Calandra  wrote:
>
>  >> Hi,
>  >>
>  >> You can do it by first plotting your values without the
>  >> x-axis: plot(x,y,log="xy", xaxt="n")
>  >>
>  >> and then plotting the x-axis with ticks where you need to:
>  >> axis(side=1, at=seq(2000,8000,1000))
>
> Getting nicer looking axis ticks  for log-scale axes (and
> traditional graphics) I have created the function eaxis()
> and the utility function pretty10exp(.)
>
> and I also created standard R's  axTicks(.)  to help with these.
>
>  if(!require("sfsmisc")) install.packages("sfsmisc")
>  require("sfsmisc")
>
>  x <- lseq(1e-10, 0.1, length = 201)
>  plot(x, pt(x, df=3), type = "l", xaxt = "n", log = "x")
>  eaxis(1)
>
> gives the attached plot
>
>
>


Re: [R] Documenting data

2016-06-30 Thread Robert Baer

You might look at:

http://stackoverflow.com/questions/7979609/automatic-documentation-of-datasets

You might also try File | Compile Notebook from within RStudio 
(https://www.rstudio.com/) on your well-documented R scripts to get a 
nice reproducible record/report of your data analysis workflow.  Similar 
functionality is available from basic R, but involves more work.  There 
are many other approaches, but the best choice depends on your precise 
needs.


And, as a programmer, you are probably already familiar with things like:
https://google.github.io/styleguide/Rguide.xml



On 6/30/2016 9:51 AM, Pito Salas wrote:

I am studying statistics and using R in doing it. I come from software 
development where we document everything we do.

As I “massage” my data, adding columns to a frame, computing on other data, 
perhaps cleaning, I feel the need to document in detail what the meaning, or 
background, or calculations, or whatever of the data is. After all it is now 
derived from my raw data (which may have been well documented) but it is “new.”

Is this a real problem? Is there a “best practice” to address this?

Thanks!

Pito Salas
Brandeis Computer Science
Feldberg 131


Re: [R] Windows 10 Application Compatibility Check | FreeWare R Statistical Environment v3.2.2

2016-07-26 Thread Robert Baer

Runs fine on Windows 10 for me.


On 7/25/2016 7:18 AM, Ramar, Rohini wrote:

Hello Team,

We, the Citi Application Readiness Team, need your assistance in gathering 
information about the compatibility of the application below with Windows 10, 
and its support on that platform, as part of our Windows 10 Readiness 
initiative. Citi is currently using "FreeWare R Statistical Environment 
v3.2.2" on the Windows 7 operating system.

We would like to know whether the application listed below is compatible with 
and supported on Windows 10 (64-bit), or whether a higher version of the 
application would be compatible with Windows 10. If you have not tested it on 
Windows 10, could you please give us a tentative date by which we can check 
back with you?

Application Name : FreeWare R Statistical Environment v3.2.2


Note: Kindly re-direct this email to appropriate team if we reached you wrongly.


Regards,
Rohini R
Citi Architecture & Technology Engineering
Client Computing
Direct Phone #:+91 22 3346 1497
Email ID: rohini.ra...@citi.com





Re: [R] Importint stata file and using value labels

2016-08-27 Thread Robert Baer
There has been some good advice not to lose the labels, but perhaps this 
gets you where you seem determined to go?


library(foreign)

?read.dta

read.dta(file, convert.dates = TRUE, convert.factors = TRUE,
 missing.type = FALSE,
 convert.underscore = FALSE, warn.missing.labels = TRUE)

or

library(readstata13)

?read.dta13

read.dta13(file, convert.factors = TRUE, generate.factors = FALSE,
  encoding = NULL, fromEncoding = NULL, convert.underscore = FALSE,
  missing.type = FALSE, convert.dates = TRUE, replace.strl = FALSE,
  add.rownames = FALSE, nonint.factors = FALSE)

Perhaps set convert.factors = FALSE?
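
A small round-trip sketch of what that setting does, using the foreign package and a made-up data frame written to a temporary .dta file (the poster's actual file is not available):

```r
library(foreign)

# Made-up data with a labelled (factor) column, written out as a Stata file.
df <- data.frame(id = 1:3, sex = factor(c("male", "female", "female")))
dta_file <- tempfile(fileext = ".dta")
write.dta(df, dta_file)

# Default: Stata value labels become factor levels.
labelled <- read.dta(dta_file)

# convert.factors = FALSE: keep only the underlying numeric codes.
raw <- read.dta(dta_file, convert.factors = FALSE)

str(labelled$sex)
str(raw$sex)
```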


On 08/27/2016 10:55 AM, Michael Friendly wrote:

On 8/26/2016 11:05 AM, Juan Ceccarelli Arias wrote:

Yep. Im a bit stalled.
I can't find the option to import only the values and drop the value 
labels

from the dta file.
Im quite sure R can do that. Then i'd only used the values and i'd 
rely on

my memory.
It isn't a bad alternative.



Hint: use str() to see the class of what you've read.
Then try as.data.frame() on the resulting object read from the .dta file.






Re: [R] loading .rda file

2016-08-30 Thread Robert Baer
I think that the .rda extension is the old extension convention for what 
now gets the .RData extension name by convention.


These are basically workspaces. These .RData files can contain multiple 
data objects, and all objects seem to read back in with the same name 
that they were saved with using the save() function.  Of course, you can 
assign a new name to the objects you read in with the standard <- or -> 
syntax.


See ?save to learn how to save them under a new name.  You can save 
just an individual data object in an .rda or .RData file and make the 
data object name match the filename if you so wish.
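
A minimal sketch of the renaming step in base R. The file is a temporary one created for illustration, and `survey` is just a hypothetical new name:

```r
# Create an example .rda file containing a data frame called "mydata".
mydata <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.0))
rda_file <- tempfile(fileext = ".rda")
save(mydata, file = rda_file)
rm(mydata)

# load() restores objects under their saved names and returns those names.
loaded <- load(rda_file)

# Copy the object to a name of your choosing, then drop the old name.
survey <- get(loaded[1])
rm(list = loaded)
```

After this, the data frame is available as survey and mydata no longer exists in the workspace.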



On 8/30/2016 9:37 AM, Leslie Rutkowski wrote:

Hi,

I'm slowly migrating from SAS to R and - for the very first time - I'm
working with a native .Rda data file (rather than importing data from other
sources). When I load this .Rda file into the global environment using
load("file path") I see a data.frame in the global environment called
"mydata" that corresponds to the .rda file.

My question: how can I change the name of this data.frame to something of
my choosing?

Thanks for considering this very simple question.

Leslie



--


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX



Re: [R] Same code on Mac?

2016-09-01 Thread Robert Baer

On 9/1/2016 9:44 AM, Sarah Goslee wrote:

On Wed, Aug 31, 2016 at 4:25 PM, Tom Mosca  wrote:

Using a PC I have written the R code for my elementary statistics students.  
One of the students has a Mac.  Should the same lines of code work on a Mac?

Where can the student find support for R on her Mac?  I don't know anything 
about them, and have never used one.



There's an official FAQ for Mac, just as there is for Windows.
https://cran.r-project.org/faqs.html
There's also a Mac-specific email help list.
https://www.r-project.org/mail.html

Most R code will run as well or better on Mac. All of the OS problems
I've run into tend to be problems with Windows. It's a bit harder to
get some geospatial stuff working on Mac, but that's unlikely to be a
problem with your elementary stats students.

Sarah has pointed you at some Mac support, so here is some additional advice 
about the student audience.   [I live in a Windows world most of the time 
and an Ubuntu world the rest of the time, so I have minimal knowledge of 
OS X.]  Having students install RStudio has really helped because it 
brings cross-platform commonality here and there.


The biggest problems I've run into with beginning statistics students 
are getting them connected to our network, reading in textbook datasets 
located on that network, and differences in the handling of line endings 
in text files.  However, if you do the groundwork to show them how to do 
some basic things early, they can support themselves with these things. 
The commands will work the same.


For text files I often have (Windows) students copy data to the 
clipboard and use a command like  x <- read.table(file = 'clipboard', 
sep = '\t', header = TRUE)  so we can work through some statistical 
tests or graphing.   This won't work on a Mac.  An equivalent 
formulation that is helpful on the Mac is x <- read.table(file = 
pipe('pbpaste'), sep = '\t', header = TRUE)
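
One way to give students a single command on both platforms is a small wrapper. This is only a sketch: the function names clipboard_source() and read_clipboard_table() are made up for illustration, and only Windows and macOS are handled:

```r
# Pick a clipboard source for read.table() based on the operating system.
clipboard_source <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
    Windows = "clipboard",  # special file name understood on Windows
    Darwin  = "pbpaste",    # macOS command that prints the clipboard
    stop("no clipboard source defined for ", sysname)
  )
}

# Read tab-separated clipboard contents (for example, cells copied
# from a spreadsheet) into a data frame.
read_clipboard_table <- function(...) {
  sysname <- Sys.info()[["sysname"]]
  src <- clipboard_source(sysname)
  # On macOS the command output is read through a pipe connection.
  con <- if (sysname == "Windows") src else pipe(src)
  read.table(file = con, sep = "\t", header = TRUE, ...)
}
```

Students would then copy the data and run x <- read_clipboard_table() on either system.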


Other than that, I think you'll find R extremely OS agnostic in a 
teaching environment.


--


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX



Re: [R] Return the indices of rows of a data frame

2016-09-20 Thread Robert Baer



On 9/19/2016 10:37 PM, John wrote:

Hi,

I have the following dataframe:


temp<-data.frame(a=c(1,1,2), b=2:4, c=1:3)
row.names(temp)<-c("D", "E", "F")
temp

   a b c
D 1 2 1
E 1 3 2
F 2 4 3

I would like R to tell me which rows have the value of "a" equal to 1. The
answer is the first row and the second row, or row D and row E. Which
function should I use? subset()? which()?


row.names(temp[temp$a==1,])
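
For completeness, a short sketch showing both the numeric positions and the row names, using the data frame from the question. The variable names idx and hits are just for illustration:

```r
temp <- data.frame(a = c(1, 1, 2), b = 2:4, c = 1:3)
row.names(temp) <- c("D", "E", "F")

# which() turns the logical test into row positions.
idx <- which(temp$a == 1)

# Index the row names with those positions to get the labels.
hits <- row.names(temp)[idx]

print(idx)
print(hits)
```

which() also drops any NA results of the comparison, which plain logical indexing with temp$a == 1 would not.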

--


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX



Re: [R] remove a "corrupted file" after using download.file() with R on Windows 7

2016-09-29 Thread Robert Baer

On 9/28/2016 11:32 PM, Fabien Tarrade wrote:

Hi there,

Sometimes download.file() fails to download the file, and I would like 
to remove the corresponding file.

No answers, but a couple of additional questions:
1)  Does the issue persist if you close R or does the file remain locked 
against deletion?
2) If so, is there a related process in the task list if you use 
CTRL-ALT-DEL?

3) Does   print(e$message) yield any useful information when it hangs?

Would debugging in R Studio shed additional light?

The issue is that I am not able to do it: Windows complains that the 
file is in use by another application.
I tried closeAllConnections() and unlink() before removing the file, 
but without success.


Any idea how I should proceed?

Please find the code below

  # consider warning as an error
  options(warn=2)

  # try to download the file
  tryCatch({
    download.file(url,path_file,mode="wb",quiet=quiet)
    return(0)
  },error = function(e){
    if(verbose){
      print(e)
      print(e$message)
    }
    # close file when it failed
    if (file.exists(path_file)){
      closeAllConnections()
      #unlink(path_file, recursive=TRUE)
      #file.create(path_file,overwrite=TRUE,showWarning=TRUE)
      #system(paste0('open "', path_file, '"'))
      file.remove(path_file,overwrite=TRUE,showWarning=TRUE)
    }
    return(1)
  })

Thanks a lot
Cheers
Fabien
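
One thing that may be worth checking in the code above: file.remove() only takes file names, so overwrite=TRUE and showWarning=TRUE are treated as two more files to delete, and with options(warn=2) the resulting warning becomes an error. Below is a sketch of a wrapper that avoids this. The name safe_download() is made up for illustration, and it does not claim to cure the Windows file lock itself:

```r
# Download a file; on any warning or error, remove the partial file.
# Returns 0 on success and a non-zero status otherwise.
safe_download <- function(url, path_file, quiet = TRUE, verbose = FALSE) {
  report <- function(cond) {
    if (verbose) message("download failed: ", conditionMessage(cond))
    1L  # non-zero status, mirroring download.file()'s convention
  }
  status <- tryCatch(
    download.file(url, path_file, mode = "wb", quiet = quiet),
    warning = report,  # treat warnings as failures, like options(warn = 2)
    error = report
  )
  if (status != 0 && file.exists(path_file)) {
    # unlink() returns 0 on success and 1 on failure.
    if (unlink(path_file) != 0) warning("could not remove ", path_file)
  }
  status
}
```

Handling warnings inside tryCatch() keeps the global warn option untouched, so the rest of the session behaves as before.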



--


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
660-626-2321 Department
660-626-2965 FAX



Re: [R] Downloading file.Rdata from internet

2015-05-04 Thread Robert Baer


On 5/3/2015 7:58 PM, Rafael Costa wrote:
> Dear R users,
>
> To load the file at "http://www.datafilehost.com/d/c7f0d342", I first
> uncheck the "Use our download manager and get recommended downloads" option
> and I click the "DOWNLOAD" button. How do I load and save the file directly
> from R?
>
> Any help on this is most appreciated.
Use the load command:
?load

On windows it might look something like:

load("C:/Users/Rafael Costa/Downloads/tabela1.1.RData")
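
If the goal is to skip the browser entirely, download.file() followed by load() can do it, provided the URL points at the .RData file itself rather than at a landing page with a DOWNLOAD button. A sketch under that assumption follows; the function name load_rdata_from() and the example.org address are hypothetical:

```r
# Download an .RData file to a temporary location and return its
# objects as a named list.
load_rdata_from <- function(rdata_url) {
  local_copy <- tempfile(fileext = ".RData")
  on.exit(unlink(local_copy))
  # mode = "wb" matters on Windows because .RData files are binary.
  download.file(rdata_url, local_copy, mode = "wb", quiet = TRUE)
  env <- new.env()
  load(local_copy, envir = env)
  as.list(env)
}

# Usage with a hypothetical direct link:
# objs <- load_rdata_from("https://example.org/tabela1.1.RData")
# names(objs)
```

Returning a list rather than loading into the global environment means nothing in the workspace is overwritten by accident.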



> Thanks in advance,
> Rafael Costa.
>

-- 


Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
rbaer(at)atsu.edu




Re: [R] gdata library 2.16.1

2015-05-04 Thread Robert Baer



On 5/4/2015 9:01 AM, Marc Girondot wrote:

Dear list-members,

Since I updated the gdata package to version 2.16.1 this morning, I have had an 
error on the two Macs I use (details on system and R versions at the 
end).


When I load the package, I have this error:

> library("gdata", 
lib.loc="/Library/Frameworks/R.framework/Versions/3.2/Resources/library")

gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

gdata: Unable to load perl libaries needed by read.xls()
gdata: to support 'XLSX' (Excel 2007+) files.

gdata: Run the function 'installXLSXsupport()'
gdata: to automatically download and install the perl
gdata: libaries needed to support Excel XLS and XLSX formats.

Then if I try installXLSXsupport(), I get another error:
> installXLSXsupport()
Error in installXLSXsupport() :
Unable to install Perl XLSX support libraries.

But my perl system seems to be ok:
> system("perl -v")

This is perl 5, version 16, subversion 3 (v5.16.3) built for 
darwin-thread-multi-2level


and the perl folder is correctly located in 
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/gdata/perl/


And of course if I try to use read.xls, I get an error (xxx.xlsx is a 
valid file):

> info <- read.xls("xxx.xlsx", stringsAsFactors=FALSE)
WARNING: Perl module Spreadsheet::ParseXLSX cannot be loaded.
WARNING: Microsoft Excel 2007 'XLSX' formatted files will not be 
processed.


Does someone have a solution ? (other than saving file in .csv ! )

Don't know about this problem, but much has been written on various 
alternatives (e.g., http://www.milanor.net/blog/?p=779)

One of their suggestions is (cross-platform, java-based solution),

require(XLConnect)
wb=loadWorkbook("myfile.xlsx")
df=readWorksheet(wb,sheet="Sheet1",header=TRUE)







Thanks

Marc


R version 3.2.0 Patched (2015-05-01 r68301) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

OS: MacOSX - Yosemite



--


Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
rbaer(at)atsu.edu



[R] Special character is graph label

2015-05-06 Thread Robert U
Dear R users,
I am having trouble finding a special character (and how to insert it) for the 
label of a graph axis. 

Let us say that the label of my axis is "X". I would like the X to have a 
"line" over it, indicating that it is the "mean of X values" (I don't even know 
how to properly state that in English...). Does anyone understand, and have 
any idea how to do that?
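
A possible approach, sketched with base graphics: R's plotmath notation (see ?plotmath) draws a bar over a symbol with bar(). The data and the label text below are made up for illustration:

```r
# An expression is drawn as mathematical annotation, not as text.
x_mean_label <- expression(bar(X))

# paste() inside expression() mixes plain text with annotation.
y_label <- expression(paste("Mean response, ", bar(Y)))

plot(1:10, (1:10)^2, xlab = x_mean_label, ylab = y_label)
```

The same expressions can be passed to title(), axis(), text() and legend().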
Greetings,
R.H



[R] Parsing large amounts of csv data with limited RAM

2015-07-14 Thread Dupuis, Robert
I'm relatively new to using R, and I am trying to find a decent solution for my 
current dilemma.

Right now, I am trying to parse one-second-resolution data from seven months of CSV 
files. This is over 10GB of data, and I've run into some "memory issues" loading 
them all into a single dataset to be plotted. If possible, I'd really like to 
keep both the one-second resolution and all 100 or so columns intact to make 
things easier on myself.

The problem I have is that the machine that is running this script only has 8GB 
of RAM. I've had issues parsing the files with lapply and some sort of CSV reader. 
So far I've tried read.csv, readr::read_table, and data.table::fread, with only 
fread having any sort of memory management (fread seems to crash on me, 
however). The basic approach I am using is as follows:

# Get the data
files = list.files(pattern="\\.csv$")  # pattern is a regular expression, not a glob
# replace fread with something that can parse csv data
set <- lapply(files, function(x) fread(x, header = T, sep = ','))

# Handle the data (Do my plotting down here)
...

These processes work with smaller data sets, but in a worst-case scenario I would 
like to be able to parse one year of data, which would be around 20GB.
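
One common way to cut memory use is to read only the columns a given plot needs, one file at a time. The sketch below does that with base R on small generated files. The helper name read_selected() and the column names time, temp and unused are invented for the demonstration, and whether this is enough for 10GB of data is untested:

```r
# Read only the columns named in `keep`; a colClasses entry of "NULL"
# tells read.csv() to skip that column entirely.
read_selected <- function(path, keep) {
  header <- names(read.csv(path, nrows = 1))
  classes <- ifelse(header %in% keep, NA, "NULL")
  read.csv(path, colClasses = classes)
}

# Build two tiny CSV files so the example is self-contained.
demo_dir <- tempfile("csvdemo")
dir.create(demo_dir)
for (i in 1:2) {
  part <- data.frame(time = 1:3 + 3 * (i - 1),
                     temp = rnorm(3),
                     unused = runif(3))
  write.csv(part, file.path(demo_dir, sprintf("part%d.csv", i)),
            row.names = FALSE)
}

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read_selected,
                                  keep = c("time", "temp")))
```

data.table::fread() has a select argument that serves the same purpose, so the same idea should carry over if fread can be made to run reliably.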

Thank you for your time,
Robert Dupuis



Re: [R] R GUI plot by color

2015-07-26 Thread Robert Baer



On 7/24/2015 6:23 AM, Jim Lemon wrote:

Hi jpara3,
Your example, when I got it to go:

one<-c(3,2,2)
two<-c("a","b","b")
data<-data.frame(one,two)
plot(data$one,col=data$two)

Wow Jim. Psychic indeed!  Not only did you answer with NO reproducible 
example, but on round 2 you fixed a non-working example and explained 
why it was an accident that it works.  What is the stock market about to 
do? :)


jpara3 - Those of us without Jim's talent can be more helpful if you 
read and follow the guide at the bottom of each email:


PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





does indeed work, and I'll explain how. You are plotting the values of
data$one against the _values_ of data$two (see point 3 of my
response). In this case, the values of data$two are of class "factor",
which means that they have numeric values attached to the levels (a,
b) of the factor. When you pass these values as the "col" argument,
they are silently converted to their numeric values (1,2,2). In the
default palette, these numbers represent the colors - black, red, red.
Those are the colors in which the points are plotted. So far, so good.
Let's look at the other two points that I guessed.

1) The column names of data2 are not numbers

colnames(data)
[1] "one" "two"

As you can see, the column names are character variables, and they
don't translate to numbers:

as.numeric(colnames(data))
[1] NA NA

2) The number of columns in data2 is not equal to the number of values
in data1 that you are plotting

It's pretty obvious that there are two values in the column names and
three in the vector of values that you are plotting in your
example. So, I think I got three out of three without knowing what the
data were.
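
The conversion described above can be checked directly. This is a small sketch; the variable names are illustrative. Note that since R 4.0.0 data.frame() no longer turns character columns into factors by default, so the factor is created explicitly here:

```r
two <- factor(c("a", "b", "b"))

# A factor stores integer codes for its levels; these are what
# the col argument ends up using.
codes <- as.integer(two)

# Look the codes up in the current palette to see the colours used.
colours_used <- palette()[codes]

# Column names, by contrast, do not convert to numbers.
col_as_num <- suppressWarnings(as.numeric(c("one", "two")))

print(codes)
print(colours_used)
print(col_as_num)
```

The exact colour names depend on the palette in use, which also changed in R 4.0.0.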

Jim


On Fri, Jul 24, 2015 at 7:53 PM, jpara3  wrote:

I have done a trial with a dataframe like this:
one<-c(3,2,2)
two<-c(a,b,b)
data<-dataframe(uno,dos)

plot(data$one,col=data$two)

and it plots perfect.

If you paste the code above in R, it has errors and does NOT plot 
perfectly.   I still did not understand what you were trying to do. You 
owe Jim big time.



If I try it with the code that I posted in the first message, selecting
data1 and data2 as in this example, the plot is drawn, but all dots have
the same color.

Thanks for the answer, but none of the 3 topics is the root problem.





--
View this message in context: 
http://r.789695.n4.nabble.com/R-GUI-plot-by-color-tp4710297p4710300.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] testing whether two character vectors contain (the same) items in the same order

2015-08-08 Thread Robert Baer



On 8/6/2015 5:25 AM, Federico Calboli wrote:

Hi All,

let’s assume I have a vector of letters drawn only once from the alphabet:

x = sample(letters, 15, replace = F)
x
  [1] "z" "t" "g" "l" "u" "d" "w" "x" "a" "q" "k" "j" "f" "n" "v"

y = x[c(1:7,9:8, 10:12, 14, 15, 13)]

I would now like to test how good a match y is for x.  Obviously I can 
transform the letters in numbers and use a rank test, but I was left wondering 
whether this is the only solution and whether there are more appropriate 
solutions that are already implemented in R (I am not going to reinvent the 
wheel if I can avoid it).

BW

F

Perhaps
install.packages("stringdist")
help(package = 'stringdist')
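
Before reaching for an extra package, two base R measures may already be enough. The sketch below uses the x printed in the question. Both measures are only suggestions: Kendall's tau is the rank approach the question mentions, and adist() gives a generalized Levenshtein edit distance on the letters pasted into strings:

```r
x <- c("z", "t", "g", "l", "u", "d", "w", "x",
       "a", "q", "k", "j", "f", "n", "v")
y <- x[c(1:7, 9:8, 10:12, 14, 15, 13)]

# Position in x of each element of y; identical order gives 1, 2, 3, ...
pos_in_x <- match(y, x)

# Kendall's tau: 1 means the same order, -1 means fully reversed.
tau <- cor(seq_along(y), pos_in_x, method = "kendall")

# Edit distance between the two letter sequences: the number of
# single-letter insertions, deletions and substitutions needed.
edits <- drop(adist(paste(x, collapse = ""), paste(y, collapse = "")))

print(tau)
print(edits)
```

The pasted-string trick relies on every item being a single character; for longer items a sequence-based distance such as those in stringdist would be needed.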







--
Federico Calboli
Ecological Genetics Research Unit
Department of Biosciences
PO Box 65 (Biocenter 3, Viikinkaari 1)
FIN-00014 University of Helsinki
Finland

federico.calb...@helsinki.fi


Re: [R] testing whether two character vectors contain (the same) items in the same order

2015-08-08 Thread Robert Baer

And I probably should have included this link:
http://journal.r-project.org/archive/2014-1/loo.pdf

On 8/8/2015 12:50 PM, Robert Baer wrote:



On 8/6/2015 5:25 AM, Federico Calboli wrote:

Hi All,

let’s assume I have a vector of letters drawn only once from the 
alphabet:


x = sample(letters, 15, replace = F)
x
  [1] "z" "t" "g" "l" "u" "d" "w" "x" "a" "q" "k" "j" "f" "n" "v"

y = x[c(1:7,9:8, 10:12, 14, 15, 13)]

I would now like to test how good a match y is for x.  Obviously I 
can transform the letters in numbers and use a rank test, but I was 
left wondering whether this is the only solution and whether there 
are more appropriate solutions that are already implemented in R (I 
am not going to reinvent the wheel if I can avoid it).


BW

F

Perhaps
install.packages("stringdist")
help(package = 'stringdist')







--
Federico Calboli
Ecological Genetics Research Unit
Department of Biosciences
PO Box 65 (Biocenter 3, Viikinkaari 1)
FIN-00014 University of Helsinki
Finland

federico.calb...@helsinki.fi


[R] Basic editing of XML file

2015-08-13 Thread Robert U
Dear R users,

I'm trying to make some very slight edits to the values in an XML file. I looked a 
bit everywhere and it appears that dealing with XML files is not that easy… besides, 
my XML files might be a bit weirdly structured. Anyway, let me give you an example 
of it:

Root(xmlfile)        …  ………

I'd like to modify the value of, say, X1 in the A1 line, or X1 in the A2 line. 
Unfortunately the structure of this dataset does not really look like the examples 
I've seen on the internet, where you have something that looks like this:  something to change  
In my case, as you noticed, the values I want to edit are inside the <>, which is pretty odd… 
Any tips?

Regards
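
Values that sit inside the <> are XML attributes, and the xml2 package can edit them. The example file did not survive in this message, so the structure below (a Root element with A1 and A2 children carrying X1 and X2 attributes) is purely a guess made for illustration:

```r
# Hypothetical document; the real file's structure is unknown.
xml_text_in <- '<Root><A1 X1="10" X2="20"/><A2 X1="30" X2="40"/></Root>'

if (requireNamespace("xml2", quietly = TRUE)) {
  doc <- xml2::read_xml(xml_text_in)

  # Locate the A1 element with an XPath query.
  node <- xml2::xml_find_first(doc, "//A1")

  # Attributes are changed in place on the parsed document.
  xml2::xml_set_attr(node, "X1", "99")

  # Read the attribute back to confirm the edit.
  new_value <- xml2::xml_attr(xml2::xml_find_first(doc, "//A1"), "X1")
  print(new_value)

  # xml2::write_xml(doc, "edited.xml")  # write the result to a file
}
```

The XML package offers similar tools if xml2 is not available.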

[R] Two-tailed exact binomial test with binom.test and sum(dbinom(...))

2014-12-13 Thread Robert Zimbardo
Hi R experts,

I have a few related questions that are actually a combination of an R
and a hopefully not too  trivial (?) statistics question, namely
regarding the computation of an exact two-tailed binomial test.

Let's assume the following scenario:
- number of trials = 10
- p of success = 0.6

(a) Let's also assume we have an H1 that there are more than 6
successes and the number of successes we get is 8. In that case, we do
sum(dbinom(8:10, 10, 0.6)) # 0.1672898
binom.test(8, 10, 0.6, alternative="greater") # 0.1673

(b) Now let's assume we have an H1 that there are fewer than 6
successes and the number of successes we get is 2. In that case, we do
sum(dbinom(0:2, 10, 0.6)) # 0.01229455
binom.test(2, 10, 0.6, alternative="less") # 0.01229

So far no problem. My questions are now concerned with a two-tailed test:

(1). My understanding would be that, if we have an H1 that says "the
number of successes won't be 6", then we can add up the two
probabilities from above:
sum(dbinom(8:10, 10, 0.6)) + sum(dbinom(0:2, 10, 0.6)) # 0.1795843, or just
sum(dbinom(c(0:2, 8:10), 10, 0.6)) # 0.1795843

However, that is not what binom.test(..., alternative="two.sided") does:
binom.test(2, 10, 0.6, alternative="two.sided") # 0.01834, which is
the method of small(er) p-values:
sum(dbinom(0:10, 10, 0.6)[dbinom(0:10, 10, 0.6)<=dbinom(2, 10, 0.6)])
# 0.01834117

Thus, question 1) is, is there a reason binom.test is implemented the
way it is rather than the other way?

(2) I am struggling to understand two-tailed scenarios like this one:
- number of trials = 235
- p of success = 1/6
- successes = 51

That is, cases where my logic of taking the successes+1 extreme cases
on each tail doesn't work: adding the point probabilities of 51:235 is
fine, but it of course makes no sense to add the point probabilities
for 0:185 to that
sum(dbinom(51:235, 235, 1/6)) # 0.02654425
sum(dbinom(0:185, 235, 1/6)) # 1 (!)

So, while binom.test again does its small(er) p-value thing, ...
binom.test(51, 235, 1/6, alternative="two.sided") # 0.04375
sum(dbinom(0:235, 235, 1/6)[dbinom(0:235, 235, 1/6)<=dbinom(51, 235,
1/6)]) # 0.04374797

... I am wondering how my approach with adding the probabilities of
the same number of events from each tail would be done here ...?

(3) What is people's view on computing the two-tailed test like this,
which leads to an ns result unlike binom.test?
2*sum(dbinom(51:235, 235, 1/6)) # 0.05308849
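
For side-by-side comparison, the two definitions used above can be wrapped in one helper. This is only a sketch; the function name two_sided_p() is made up, and the small tolerance factor copies the guard against floating-point ties that binom.test() itself applies:

```r
two_sided_p <- function(x, n, p) {
  d <- dbinom(0:n, n, p)
  # "Small p-values" method: sum every outcome that is no more
  # likely than the observed one (with a tiny relative tolerance).
  small_p <- sum(d[d <= dbinom(x, n, p) * (1 + 1e-7)])
  # Doubling method: twice the smaller one-sided tail, capped at 1.
  lower <- pbinom(x, n, p)
  upper <- pbinom(x - 1, n, p, lower.tail = FALSE)
  doubled <- min(1, 2 * min(lower, upper))
  c(small_p = small_p, doubled = doubled)
}

res <- two_sided_p(51, 235, 1/6)
print(res)
```

The first element should agree with binom.test(51, 235, 1/6)$p.value and the second with the doubled tail in (3).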

Any input would be much appreciated!

R.Z.



[R] pairing columns based on a value

2014-12-17 Thread Robert Strother
I have a large dataset (~50,000 rows, 96 columns), of hospital
administrative data.
many of the columns are clinical coding of inpatient event (using ICD-10).
A simplified example of the data is below

> dput(dat_unmatched)
structure(list(ID = structure(c(4L, 3L, 2L, 1L), .Label = c("BCM3455",
"BZD2643", "GDR2343", "MCZ4325"), class = "factor"), X.1 = structure(c(2L,
3L, 1L, 1L), .Label = c("B83.2", "C23.2", "F56.23"), class = "factor"),
X.2 = structure(c(2L, 1L, 2L, 2L), .Label = c("M20.64", "T43.2"
), class = "factor"), X.3 = structure(c(2L, 3L, 3L, 1L), .Label =
c("F56.23",
"R23.1", "Y32.1"), class = "factor"), X.4 = structure(c(1L,
2L, 2L, 3L), .Label = c("M23.5", "T44.2", "Y32.1"), class = "factor"),
X.5 = structure(c(1L, 2L, 1L, 2L), .Label = c("", "Q23.6"
), class = "factor")), .Names = c("ID", "X.1", "X.2", "X.3",
"X.4", "X.5"), class = "data.frame", row.names = c(NA, -4L))

I am interested in a set of codes that start with a "T" or a "Y", and link
them to the preceding column that does not begin with a "T" or "Y".   I
suspect I will need to use regular expressions, and likely a loop, but I am
really out of my depth at this point.
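
As a possible starting point, here is a sketch of one reading of the linking rule: walk along one patient's codes and attach each "T" or "Y" code to the most recent code that does not start with "T" or "Y". The function name pair_ty_codes() and the output layout are invented, and the rule may not match the intended result exactly:

```r
pair_ty_codes <- function(codes) {
  codes <- as.character(codes)
  pairs <- list()
  last_dx <- NA_character_  # most recent code not starting with T or Y
  for (code in codes[nzchar(codes)]) {  # skip empty cells
    if (grepl("^[TY]", code)) {
      pairs[[length(pairs) + 1]] <- data.frame(
        ty_code = code, linked_to = last_dx, stringsAsFactors = FALSE
      )
    } else {
      last_dx <- code
    }
  }
  do.call(rbind, pairs)  # NULL when the row has no T or Y codes
}

# Codes for patient GDR2343 from the example data.
row2 <- c("F56.23", "M20.64", "Y32.1", "T44.2", "Q23.6")
res <- pair_ty_codes(row2)
print(res)
```

Applying it to every patient could be done with lapply() over the rows of the code columns.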

I would like the final dataset to look like:

> dput(dat_matched)
structure(list(ID = structure(c(4L, 3L, 2L, 1L), .Label = c("BCM3455",
"BZD2643", "GDR2343", "MCZ4325"), class = "factor"), X.1 = structure(c(2L,
3L, 1L, 1L), .Label = c("B83.2", "C23.2", "M20.64"), class = "factor"),
X.2 = structure(c(1L, 2L, 1L, 1L), .Label = c("T43.2", "Y32.1"
), class = "factor"), X.3 = structure(c(1L, 4L, 2L, 3L), .Label = c("",
"B83.2", "F56.23", "M20.64"), class = "factor"), X.4 = structure(c(1L,
2L, 3L, 3L), .Label = c("", "T44.2", "Y32.1"), class = "factor"),
X.5 = structure(c(1L, 1L, 2L, 1L), .Label = c("", "B83.2"
), class = "factor"), X = structure(c(1L, 1L, 2L, 1L), .Label = c("",
"T44.2"), class = "factor")), .Names = c("ID", "X.1", "X.2",
"X.3", "X.4", "X.5", "X"), class = "data.frame", row.names = c(NA,
-4L))

Any help appreciated.

Matthew



Re: [R] Interview questions?

2015-01-11 Thread Robert Sherry
Here is a question that I might ask. What are the alternatives to R and 
how does R compare? That is, for what class of problems is R the best 
tool around?


Bob

On 1/11/2015 1:16 PM, Barry Rowlingson wrote:

Ask if they have a favourite R programmer. This will tell you how much into
the R culture they are, and perhaps also tell you if their opinions of a
good programmer concur with yours...
On 11 Jan 2015 16:49, "Keith S Weintraub"  wrote:


Folks,

I was wondering if anyone has put together a list of R job interview
questions?

I’m thinking of about 5-20 possibly open ended questions for interviewing
a candidate to do R programming. Just programming. Not statistics or
mathematics.

What I don’t want are tricky “puzzles” that are more about how clever the
interviewer questions are than how to get the best person for the job.

I would consider myself a mid-level R programmer so this would also be a
great opportunity to learn more and be able to hire a great candidate.

I am perfectly happy to get a reference, book title or URL.

Not looking for anyone to do my work for me!

Best to all,
Happy New Year,
KW



Re: [R] Installing RStudio

2015-02-12 Thread Robert Baer


On 2/12/2015 10:22 AM, John Sorkin wrote:

Windows 7, 64-bit.
  
I am trying to install RStudio. Before installing RStudio, I installed R 3.1.2. During the installation of R, I installed (as per the default) 32- and 64-bit packages. When I tried to install RStudio, I received the message

R does not appear to be installed. Please install R before using RStudio.
I know R is installed, because I am able to run R.
Can anyone suggest what I can do to get RStudio installed?
Thank you
John

Tools > Global Options > General, and use the box at the top to point 
RStudio at the installed version of R you wish to use. Usually, it will 
do a pretty good job of finding it on its own, but you can always adjust 
the version to suit here.
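
If it is unclear which folder to point RStudio at, R itself can report where it is installed. A small sketch to run in the plain R console; the variable names are illustrative:

```r
# Top-level directory of the running R installation.
r_install_dir <- R.home()

# Directory that holds the R executables.
r_binary_dir <- R.home("bin")

# Version string, useful when several versions are installed.
r_version <- R.version.string

cat(r_version, "\n", r_install_dir, "\n", r_binary_dir, "\n", sep = "")
```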


Rob



John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)



Re: [R] API request from R

2015-02-19 Thread Robert Baer

On 2/19/2015 8:06 AM, Barry Rowlingson wrote:
> On Wed, Feb 18, 2015 at 11:44 AM, Mittal Ashra via R-help
>  wrote:
>> Dear All,
>> Apologies for mailing it to the whole crowd. This is Mittal, presently 
>> working in a Project where we have build a platform for displaying 
>> recommendations and the results are based on the statistical models.
>> I have gone through the CRAN repository to look out for an package which 
>> converts the R code into an JAVA API and that can be called from the 
>> platform. However, did not find any. If anyone can guide me to the right 
>> package that will be grateful.
>> The packages can be similar to DeployR from Revolution Analytics.
>   I doubt there's anything smart enough to take a set of R functions
> and magically create all the necessary Java boilerplate code that
> constitutes an implementation of an API in Java (cynics would say Java
> was all boilerplate...).
>
>   There's the rJava package, which includes the JRI system for calling
> R from Java. Then your java can kick off an R "engine" and do R stuff:
I thought rJava called java from R not the other way around.

Description: Low-level interface to Java VM very much like .C/.Call and 
friends. Allows creation of objects, calling methods and accessing fields.




>
>[boilerplate code deleted]
>
>Rengine re=new Rengine(args, false, new TextConsole());
>
>[more deleted boilerplate]
>
>re.eval("data(iris)",false);
>
> What you would have to do would be to write the Java
> functions/methods/classes with the appropriate arguments for your API
> and make them call the R code this way.
>
>   I think RCaller is another way of doing this from Java - its not on
> CRAN since its not an R package, its a Java library.
>
> Barry
>

-- 


Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
rbaer(at)atsu.edu




[R] no variable removal when running glmnet on diabetes dataset with alpha=1, lambda=.1

2015-03-19 Thread Feyerharm, Robert
New ZixCorp secure email message from ValueOptions Secure Email

To view the secure message, click on the link below or copy and paste the link 
into your Internet browser address bar.

https://securemail-valueoptions.com/s/e?m=ABC5UVsyBkxRSfS2UXb8k64p&c=ABDfgFbxA3h2RUXHIPdWEiCw&em=R%2dhelp%40r%2dproject%2eorg

You are reading the plaintext version of this message.  For a better user 
experience, change your email settings to enable the viewing of HTML.

Do not reply to this notification message; this message was auto-generated by 
the sender's security system. To reply to the sender, click on the link above.

The secure message expires on Sep 15, 2015 @ 08:04 PM (GMT).

Want to send and receive secure email messages transparently? 
http://www.zixcorp.com/info/zixmail_ZMC





[R] reading in from xlsx files

2015-03-24 Thread Robert Lyons
I'm sorry if this is well below the level of this forum.
Using R Console v3.1.3 32-bit
Both of our R Programming sources left the company and I'm in need of some very 
basic help.
The code is reading in column and row information from two xlsx files.
I made what I thought were some basic changes to the contents of those files, 
one was a correction to a typo for a row, the other was flipping two columns 
that were in the wrong order.
When I Source the R Code neither change shows up in the output.

Thank you for your time.

Cordially,
Bob Lyons





Re: [R] Obfuscate AES password

2015-04-14 Thread Robert Baer
I'm not sure I completely understand your authentication needs, but 
perhaps the RCurl package could be of some use to you.


Rob

On 4/13/2015 1:26 AM, Luca Cerone wrote:

Thanks Jeff,
and OK I'll move next questions on the topic to the devel list :)

I was hoping there were packages that already dealt with this sort of
things, that's why I posted my question here in the first place..

Thanks a lot for helping me with this,

Cheers,
Luca



--


Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A T Still University of Health Sciences
800 W. Jefferson St
Kirksville, MO 63501
rbaer(at)atsu.edu



Re: [R] Finding unique terms

2018-10-15 Thread Robert Baer




Dear r-users,

I have this data:

structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
 COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
 4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
 "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class =
"factor"),
 PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
 82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
 100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
 41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
 X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class =
"data.frame", row.names = c(NA,
-11L))

I want to combine the same Student ID and add up all the values for PO1M,
PO1T,...,PO2T obtained by the same ID.

How do I do that?
Thank you for any help given


# load data

# Enter dataframe by hand
dat <- structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
    COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
    4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
    "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class =
"factor"),
    PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
    82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
    100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
    41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
    X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class =
"data.frame", row.names = c(NA,
-11L))

# Create sums by student ID

library(dplyr)
dat %>%
  group_by(STUDENT_ID) %>%
  summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
    sum.PO1T = sum(PO1T, na.rm = TRUE),
    sum.PO2M = sum(PO2M, na.rm = TRUE),
    sum.PO2T = sum(PO2T, na.rm = TRUE))



Re: [R] Finding unique terms

2018-10-15 Thread Robert Baer




On 10/11/2018 5:12 PM, roslinazairimah zakaria wrote:

Dear r-users,

I have this data:

structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
 COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
 4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
 "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class =
"factor"),
 PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
 82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
 100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
 41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
 X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class =
"data.frame", row.names = c(NA,
-11L))

I want to combine the same Student ID and add up all the values for PO1M,
PO1T,...,PO2T obtained by the same ID.

How do I do that?
Thank you for any help given.

oops!  Forgot to clean up after my cut and paste. Solution with dplyr 
looks like this:

# Create sums by student ID
library(dplyr)
dat %>%
  group_by(STUDENT_ID) %>%
  summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
    sum.PO1T = sum(PO1T, na.rm = TRUE),
    sum.PO2M = sum(PO2M, na.rm = TRUE),
    sum.PO2T = sum(PO2T, na.rm = TRUE))
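The same per-student totals can also be sketched in base R, without dplyr. This is only an illustration: the data frame below is a trimmed, hand-typed copy of the posted data (COURSE_CODE, X and X.1 are left out), and the names score_cols and sums are made up for the example.

```r
# Trimmed copy of the posted data (assumption: COURSE_CODE, X and X.1 omitted)
dat <- data.frame(
  STUDENT_ID = c(rep("AA15285", 6), rep("AA15286", 5)),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# Columns to add up (hypothetical helper name)
score_cols <- c("PO1M", "PO1T", "PO2M", "PO2T")

# One row per student; na.rm = TRUE is passed through to sum()
sums <- aggregate(dat[score_cols],
                  by = list(STUDENT_ID = dat$STUDENT_ID),
                  FUN = sum, na.rm = TRUE)
print(sums)
```

Listing the columns once in score_cols avoids the copy-and-paste slip that is easy to make when each sum() is typed by hand.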



[R] Dotchart and its arguments

2019-02-20 Thread Robert Zimbardo
Hi all

I was recently trying to customise a dotchart of a matrix

dats <- matrix(1:6, nrow=2, dimnames=list(R=letters[1:2], C=letters[14:16]))
dotchart(dats)

with pch and pt.cex and noticed an irregularity: R does not apply the
values of these arguments in the same positions it uses for plotting
the data:

dotchart(dats, pch=as.character(dats))   # wrong
dotchart(dats, pch=as.character(dats[,3:1])) # right
dotchart(dats, pch=as.character(dats[,3:1]), pt.cex=dats)   # wrong
dotchart(dats, pch=as.character(dats[,3:1]), pt.cex=dats[,3:1]) # right

Is this a bug or a feature (whose purpose then I don't get)?

Thanks



Re: [R] Error trapping in R

2019-02-28 Thread Robert Knight
Some people use try blocks, like those found in other languages.  Put the code 
you want to try inside the block.

https://www.robertknight.io/blog/try-blocks-in-r-for-error-handling/ contains a 
quick example.  The example doesn’t raise exceptions or anything; it just 
contains the error for you so the script keeps going.  I like handling errors 
with if statements inside of try blocks.
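
A minimal sketch of that idea in base R, not taken from the linked post: safe_log() is a hypothetical helper, and log("a") merely stands in for any step that may fail. The first part uses tryCatch() to replace a failed result with NA; the second uses try() plus an if statement to skip dependent code.

```r
# safe_log() is a hypothetical helper used only for illustration
safe_log <- function(x) {
  tryCatch(
    log(x),
    error = function(e) {
      # Report the problem and return a placeholder instead of stopping
      message("Skipping this step: ", conditionMessage(e))
      NA_real_
    }
  )
}

res_ok  <- safe_log(10)   # works normally
res_bad <- safe_log("a")  # error is caught, NA is returned

# try() keeps the script running; test the result with an if statement
attempt <- try(log("a"), silent = TRUE)
if (inherits(attempt, "try-error")) {
  message("log() failed, skipping the dependent code")
} else {
  print(attempt)
}
```

Either form gives roughly the effect of "On Error goto" in VB: the script continues and the code that depends on the failed step can be skipped.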

Robert



> On Feb 27, 2019, at 2:55 PM, Bernard Comcast  
> wrote:
> 
> What is the recommended way to trap errors in R? My main need is to be able 
> to trap an error and then skip a section of code if an error has occurred. In 
> VB for Excel I used the “On Error goto  .” construct to do this.
> 
> Bernard
> Sent from my iPhone so please excuse the spelling!"



[R] Using R to Compute Covariance

2014-07-26 Thread Robert Sherry

I have the following data set:
x   y   p
1   1   1/2
2   2   1/4
3   9   1/4

In this case, p represents the probability of the values occurring. I
compute the covariance of x and y by hand and come up with a value of 41/16.
When computing the covariance, I am dividing by n (in this case 3) not n-1.

I now want to use R to find the covariance. I understand that R will divide
by n-1 not n.  Here are the commands that I issued:

x = c(1,2,3)
y = c(1,2,9)
df =dataframe(x,y)
w1 = c(1/2,1/4,1/4)
cov.wt(df, wt = w1 )

The last command returns:

$cov
    x    y
x 1.1  4.1
y 4.1 17.9

$center
   x    y
1.75 3.25

$n.obs
[1] 3

$wt
[1] 0.50 0.25 0.25

Therefore, I conclude that R is finding the covariance of x and y to be 4.1.
However, I need to adjust that number by multiplying it by 2 and then
dividing by 3. However, when I get that I still do not get 41/16.  What am I
missing?

I thank the group in advance for their responses.

Bob



Re: [R] Using R to Compute Covariance

2014-07-26 Thread Robert Sherry

David,

Thanks for the response. I believe you have solved my problem.

Bob

On 7/26/2014 3:50 PM, David Winsemius wrote:


On Jul 26, 2014, at 11:07 AM, Robert Sherry wrote:


I have the following data set:
x   y   p
1   1   1/2
2   2   1/4
3   9   1/4

In this case, p represents the probability of the values occurring. I
compute the covariance of x and y by hand and come up with a value of 
41/16.
When computing the covariance, I am dividing by n (in this case 3) 
not n-1.


I now want to use R to find the [covariance]. I understand that R 
will [divide]

by n-1 not n.


Please read what the help page says about the choice of the method 
parameter.




 Here are the commands that I issued:

x = c(1,2,3)
y = c(1,2,9)
df =dataframe(x,y)


# There's no function named 'dataframe'.


df =data.frame(x,y)
w1 = c(1/2,1/4,1/4)
cov.wt(df, wt = w1 )



> cov.wt(df, wt = w1 ,method="ML")$cov[2,1]
[1] 2.5625
> all.equal (41/16, cov.wt(df, wt = w1 ,method="ML")$cov[2,1] )
[1] TRUE
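
A short sketch of why the two numbers differ, using only base R. With its default method = "unbiased", cov.wt() divides the weighted sum of cross-products by one minus the sum of the squared weights, not by n or n-1; method = "ML" uses the weights as probabilities directly. The variable names below (mx, my, cov_ml, cov_unb) are illustrative only.

```r
x <- c(1, 2, 3)
y <- c(1, 2, 9)
w <- c(1/2, 1/4, 1/4)          # probabilities, already summing to 1
df <- data.frame(x, y)

# Probability-weighted covariance computed by hand
mx <- sum(w * x)               # weighted mean of x
my <- sum(w * y)               # weighted mean of y
cov_ml <- sum(w * (x - mx) * (y - my))

# Default cov.wt() result and the factor that links it to the hand value
cov_unb <- cov.wt(df, wt = w)$cov["x", "y"]
scale_factor <- 1 - sum(w^2)

print(c(by_hand = cov_ml, default = cov_unb, rescaled = cov_unb * scale_factor))
```

So multiplying 4.1 by 2/3 cannot recover 41/16; the factor that applies here is 1 - sum(w^2).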



The last command returns:

$cov
    x    y
x 1.1  4.1
y 4.1 17.9

$center
   x    y
1.75 3.25

$n.obs
[1] 3

$wt
[1] 0.50 0.25 0.25

Therefore, I conclude that R is finding the covariance of x and y to 
be 4.1.

However, I need to adjust that number by multiplying it by 2 and then
dividing by 3. However, when I get that I still do not get 41/16.  
What am I

missing?






[R] K-nearest neighbor

2014-08-07 Thread Robert U
Dear R-users,

I am looking for a weighted knn-search function, but I cannot manage to find 
one. There are several weighted knn classifiers, but I would rather 
use a simple 'search function' (such as get.knnx). Does anyone know of a search 
function with a "weight" option?

Thanks


[R] Updates to R Core and R Foundation Membership

2014-09-15 Thread Robert Gentleman
Hi all,
  It is my pleasure to announce new members to R Core and to the R Foundation
whose efforts will be most appreciated as R continues to evolve and advance.

There are 2 new R Core members: Martin Morgan and Michael Lawrence.
In addition, Stefano Iacus has decided to step down from R Core.

There are 7 new R Foundation members:
  Dirk Eddelbuettel, Torsten Hothorn, Marc Schwartz,
  Hadley Wickham, Achim Zeileis, Martin Morgan and Michael Lawrence.
  The R Foundation now has 29 ordinary members.

Please join me in welcoming them to their new roles and especially in
thanking Stefano for his many years of contributions.


  best wishes
Robert

 for the R Foundation

-- 
Robert Gentleman
rgent...@gmail.com

___
r-annou...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce


Re: [R] Question about range of letters

2014-10-04 Thread Robert Baer


On 10/4/2014 8:21 AM, Nia Gupta wrote:

Hello,

I have a column with a bunch of letters. I would like to keep some of these 
letters (A,C,D,L) and turn the rest into 'X'.

I have tried using ifelse with '|' in between the argument but it didn't work 
nor did 4 separate ifelse statements.

Example, I currently have:
Letters: A B C D E
I would like to have:
Letters: A X C D X
Thank you

[[alternative HTML version deleted]]

try:
let = sample(LETTERS[1:5],100,replace=TRUE)
let
let1 =  ifelse(let %in% c('A','C','D'),let,'X')
let1
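
A variation on the same idea, sketched with the full set of letters from the question (A, C, D, L). The vector letters_in is invented for illustration; as.character() is there as a precaution in case the real column is a factor, since ifelse() on a factor would otherwise return integer codes.

```r
# Made-up input; in practice this would be the column from the data
letters_in <- factor(c("A", "B", "C", "D", "E", "L"))

# Letters to keep, as listed in the question
keep <- c("A", "C", "D", "L")

# Convert to character first so ifelse() returns letters, not factor codes
chr <- as.character(letters_in)
letters_out <- ifelse(chr %in% keep, chr, "X")
print(letters_out)
```

Using %in% replaces the chain of '|' comparisons and scales to any number of letters.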




[R] Help Getting the Price of Gold

2014-10-23 Thread Robert Sherry
I am trying to get the current price of gold for my application. I am 
using the library quantmod. The
R commands I use are:
  getMetals(c('XAU'), from=Sys.Date(), autoassign = FALSE )
  XAUUSD$XAU.USD[1,1]

I would expect the value in  XAUUSD$XAU.USD[1,1] to be a scalar but it 
comes back with a date and a number. All I want is the current value, 
not today's date. How do I just get the value?

Thanks
Bob



[R] Problem getting Option Quotes

2014-10-23 Thread Robert Sherry


I am using R and quantmod to get stock and option quotes. However, it 
has stopped working. I expect the following function call to produce a 
list of options:
getOptionChain( "XOM", Exp = "2015-01-20" )
However, I get the following error messages:
Error in lapply(strsplit(opt, ""), function(.) gsub(",", "", gsub("N/A",  :
  subscript out of bounds
In addition: Warning message:
In readLines(paste(paste("http://finance.yahoo.com/q/op?s=", Symbols,  :
  incomplete final line found on 
'http://finance.yahoo.com/q/op?s=XOM&m=2015-01-20+Options'


Has something changed? Am I doing something wrong?

Thanks
Bob



[R] Error message from allEffects(model) / effect(model)" ‘range’ not meaningful for factors"

2015-08-19 Thread Robert Zimbardo
Hi

I cannot figure out why the effects package throws me error messages
with the following simple code:


rm(list=ls(all=TRUE)); set.seed(1); library(effects)
# set up data
x <- factor(rep(letters[1:3], each=100))
y <- c(rnorm(100, 3, 3), rnorm(100, 4, 3), rnorm(100, 5, 3))


# fit linear model
m <- summary(lm(y~x)) # no problem

# now the problem
plot(allEffects(m))
# Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,  :
#   ‘range’ not meaningful for factors
plot(effect("x", m))
# Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,  :
#   ‘range’ not meaningful for factors


Any ideas? It's got to be something super obvious, but I don't get it. Thanks,
RZ
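
One likely cause, offered as an assumption rather than a confirmed diagnosis: m holds the summary of the fit, not the fitted model itself, and model-based helpers generally expect the latter. The base R sketch below only shows the difference between the two objects; the names fit and fit_summary are illustrative, and the effects call is left as a comment because it is untested here.

```r
set.seed(1)
x <- factor(rep(letters[1:3], each = 100))
y <- c(rnorm(100, 3, 3), rnorm(100, 4, 3), rnorm(100, 5, 3))

fit <- lm(y ~ x)              # the fitted model object
fit_summary <- summary(fit)   # a different object, meant for printing

print(class(fit))
print(class(fit_summary))

# Assumed fix (requires the effects package, not run here):
# library(effects); plot(allEffects(fit))
```

Keeping the model and its summary in separate variables makes it harder to pass the wrong one by accident.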


[R] Working with Data Frames

2015-11-03 Thread Robert Sherry
I have created what I believe to be a data frame. It is called 
env1$SPY.  The r statement head( env1$SPY ) produces the following output:


           SPY.Open SPY.High SPY.Low SPY.Close SPY.Volume SPY.Adjusted
1995-01-03  45.7031  45.8437 45.6875   45.7812     324300     31.55312
1995-01-04  45.9843      46. 45.7500       46.     351800     31.70392
1995-01-05  46.0312  46.1093 45.9531       46.      89800     31.70392
1995-01-06  46.0937  46.2500 45.9062   46.0468     448400     31.73617
1995-01-09  46.0312  46.0937     46.   46.0937      36800     31.76850
1995-01-10  46.2031  46.3906 46.1406   46.1406     229800     31.80082

The above data frame was created by the following commands:
library( quantmod )
env1 <- new.env()
getSymbols("SPY", src = 'yahoo', from = '1995-01-01', env = env1, 
auto.assign = T)


Now, what I want to do is to loop through the data and look for when the 
month changes. What is the proper way of writing a for loop in R and 
accessing the date field?
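
A minimal sketch of detecting month changes, using a made-up vector of dates so that it runs without quantmod. For the real object, which is probably an xts series rather than a plain data frame, the dates would presumably come from index(env1$SPY); that is an assumption, not something shown in this message. Both a for loop and a vectorised form are given.

```r
# Made-up trading dates standing in for the dates of the SPY series
dates <- as.Date(c("1995-01-30", "1995-01-31", "1995-02-01",
                   "1995-02-02", "1995-03-01"))
month_key <- format(dates, "%Y-%m")   # e.g. "1995-01"

# for-loop version: remember each position where the month differs
# from the previous row
change_idx <- integer(0)
for (i in seq_along(dates)[-1]) {
  if (month_key[i] != month_key[i - 1]) {
    change_idx <- c(change_idx, i)
  }
}
first_days_loop <- dates[change_idx]

# Vectorised version: compare each element with its predecessor
changed <- c(FALSE, month_key[-1] != month_key[-length(month_key)])
first_days_vec <- dates[changed]

print(first_days_loop)
```

The loop answers the question as asked; the vectorised comparison is usually the more idiomatic choice in R.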

Bob



[R] Count of included observations in sp.correlogram

2015-11-04 Thread Robert U
Dear R users,

I’m trying to figure out what I think is a pretty simple thing for anyone who 
knows about correlograms. I have a regular grid (say 5*5 points) with some 
quantity associated with each point (count data). I’m trying to verify whether 
this quantity is regularly, randomly or “clusteredly” distributed on the grid. 
I’ve decided to give sp.correlogram {spdep} a shot.

I first created a grid using cell2nb:

grid <- cell2nb(5,5)
xyc <- attr(grid,"region.id")
xy <- matrix(as.integer(unlist(strsplit(xyc, ":"))), ncol=2, byrow=TRUE)
plot(grid,xy)

> grid
Neighbour list object:
Number of regions: 25
Number of nonzero links: 80
Percentage nonzero weights: 12.8
Average number of links: 3.2

I then used sp.correlogram, and specified “order = 4” since I figured the 
maximum lag between 2 points on a 5 by 5 grid is 4… In sp.correlogram we do 
not have to specify a “style” as in moran.test, not sure why so far… anyway.

results <- sp.correlogram(grid, data$quantity, order=4, method = "I")
print(results,"bonferroni")

In the “print” table, the count of observations per lag order (in brackets) is 
25 for each lag. This is what I do not understand: should this count not 
change with the lag?  I mean, when you look at the graph of “grid” I would 
have expected a lower number for lag 4 (say only 15 pairs of observations are 
“that far”) and a much higher number for lag 1… Does that make sense to 
anyone?

Regards

[R] QuantMod and XML

2015-11-07 Thread Robert Sherry


I am trying to use the package quantmod to get option quotes in R. 
Therefore, I executed the following two commands:

library ("quantmod" )
getOptionChain("AAPL")
The first one worked but the second one produced the following error 
message:

Error in getOptionChain.yahoo(Symbols = "AAPL") :
package:“XML”cannot be loaded.
Therefore, I am thinking I need to install the package XML. To do so, I 
executed the following command:

install.packages( "XML" )
However, that command failed because it could not find the package XML. 
The following URL:

https://cran.r-project.org/web/packages/XML/XML.pdf
indicates to me that it does exist.

I am hoping somebody can tell me what I am doing wrong.

Thanks
Bob


Re: [R] QuantMod and XML

2015-11-07 Thread Robert Sherry
Thanks for the response. I am currently using Windows 7. I tried the 
following command:
 install.packages("XML", repos = "http://www.omegahat.org/R")
and I got:
 Installing package into ‘C:/Users/Bob/Documents/R/win-library/3.1’
 (as ‘lib’ is unspecified)
 Warning: unable to access index for repository 
http://www.omegahat.org/R/bin/windows/contrib/3.1

package ‘XML’ is available as a source package but not as a binary

 Warning message: package ‘XML’ is not available (for R version 3.1.2)

I also tried this command:
 install.packages("RSXML", repos = "http://www.omegahat.org/R")
and I got:
 Installing package into ‘C:/Users/Bob/Documents/R/win-library/3.1’
 (as ‘lib’ is unspecified)
 Warning: unable to access index for repository 
http://www.omegahat.org/R/bin/windows/contrib/3.1
 Warning message:
 package ‘RSXML’ is not available (for R version 3.1.2)

I am wondering why it is not working. Please help.

Thanks
Bob


On 11/7/2015 6:41 PM, Hasan Diwan wrote:
> Bob,
>
> On 7 November 2015 at 15:27, Robert Sherry  <mailto:rsher...@comcast.net>> wrote:
>
>
> I am trying to use the package quantmod to get option quotes in R.
> Therefore, I executed the following two commands:
> library ("quantmod" )
> getOptionChain("AAPL")
> The first one worked but the second one produced the following
> error message:
> Error in getOptionChain.yahoo(Symbols = "AAPL") :
> package:“XML”cannot be loaded.
> Therefore, I am thinking I need to install the package XML. To do
> so, I executed the following command:
> install.packages( "XML" )
> However, that command failed because it could not find the package
> XML. The following URL:
> https://cran.r-project.org/web/packages/XML/XML.pdf
> indicates to me that it does exist.
>
>
> It also shows its webpage to be at http://www.omegahat.org/RSXML. On 
> the root of the site -- http://www.omegahat.org -- the installation 
> command is given as install.packages(packageName, repos = 
> "http://www.omegahat.org/R"). So perhaps, you should try 
> install.packages("XML", repos = "http://www.omegahat.org/R") or 
> install.packages("RSXML", repos = "http://www.omegahat.org/R") as one 
> of those two should get you what you want. Hope that helped... -- H
>
>
> I am hoping somebody can tell me what I am doing wrong.
>
> Thanks
> Bob
>
>
>
>
>
> -- 
> OpenPGP: https://hasan.d8u.us/gpg.key
> Sent from my mobile device
> Envoyé de mon portable



Re: [R] not allocate of vactor size

2015-11-22 Thread Robert Sherry
I am thinking that R is running out of memory. Therefore, I would look 
to increase the size of my virtual memory. Here are two links that 
might help you with that:
http://windows.microsoft.com/en-us/windows/change-virtual-memory-size#1TC=windows-7
http://www.ehow.com/how_5001512_increase-virtual-memory-linux.html

Bob

On 11/22/2015 10:08 AM, Tamsila Parveen via R-help wrote:

Hello, is there anyone who can help me resolve a memory issue in R? When I 
want to analyze the data in a 1 GB file, R returns the error: not allocate 
of vector size of 1.8 GB. I tried on Linux as well as on Windows with a 64-bit 
system, using the 64-bit R-3.2.2 version. If anyone knows, please guide me to 
resolve this issue.


Re: [R] Use of "file.choose()" or "change working directory" tab causing stall on Mac

2015-12-20 Thread Robert Baer


On 12/19/2015 10:39 PM, Vinny Mo wrote:
> Hello,
>
>
> I used to use the "file.choose()" command quite a lot, as well as the "change 
> working directory" drop down tab as part of my workflow with R, but for over 
> 1 year both of these actions have caused the spinning-wheel to crash R (just 
> R, not any other program).
>
>
> The issue seems to happen when the GUI pops up, and happens about 80% of the 
> time I use either of these actions. This issue has remained constant across 
> different computers (though all macs), different R builds, and different Mac 
> OS's. I had asked about this issue before, and had hoped that this bug might 
> be fixed at some point, but it has persisted.
The posting guide asks for a reproducible example.  This is problematic 
if you have only an 80% failure rate. Nevertheless, do you have a 
verbatim example that has failed at least once?  If a particular 
formulation fails one time for you, does it always fail or can a certain 
syntax work 1 in 5 times?

If this is a Mac-only problem you might get more help on that mailing list:
https://stat.ethz.ch/mailman/listinfo/r-sig-mac

You should install the most recent version of R and reproduce the 
problem there.  When posting again, it would be helpful to supply the 
results of
R.Version()   for your setup.

>
> I know I can work around this issue programmatically by typing these commands 
> manually, but both of these features represent a nice function that R has 
> that I'd like to continue to use as was intended. Does anyone have any idea 
> how I might be able to get this functionality back, or if the R Gods have any 
> thoughts about addressing this issue?
>




[R] F Distribution

2015-12-21 Thread Robert Sherry


When I use a table, from a Schaum book, I see that for the 95th 
percentile, with v_1 = 1 and v_2 = 1, the value is 161. In the modern era,
looking values up in a table is less than ideal. Therefore, I would 
expect R to have a function to do this and, based upon my
reading of the documentation, I would expect the following call to return 
that value:

 pf( .95,1, 1)
However, it produces
0.4918373
Therefore, I conclude that I am using the wrong function. What function 
should I use?
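
A short sketch of the distinction, in base R: pf() is the cumulative distribution function (it takes a value of F and returns a probability), while qf() is its inverse, the quantile function (it takes a probability and returns a value of F). The table lookup described above corresponds to qf(). The variable names are illustrative only.

```r
# Quantile function: the F value below which 95% of the distribution lies
f_crit <- qf(0.95, df1 = 1, df2 = 1)
print(f_crit)

# Distribution function: the probability of an F value at or below 0.95
p_low <- pf(0.95, df1 = 1, df2 = 1)
print(p_low)

# The two functions undo each other
print(pf(f_crit, df1 = 1, df2 = 1))
```

The same p/q naming applies to R's other distributions, for example pnorm()/qnorm() and pt()/qt().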


Thanks
Bob



[R] Sorting a Data Frame

2016-01-22 Thread Robert Sherry

In R, I run the following commands:
df = data.frame( x=runif(10), y=runif(10) )
df2 = df[order(x),]

The first, as I would expect, creates a data frame with two columns and 
10 rows. I expect the second to sort the data based upon
the column x and produce a new data frame, df2, with the same size as 
df. However, the data frame it produces is much larger.
I do not understand what is going on. I am hoping somebody can help me. 
I am also wondering if I should have a comma after
order(x) in the second statement. I do not see a purpose for it but it 
was in an example on the web.
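
A minimal sketch of the sort, with hedged guesses about the two puzzles. A plausible reason for the oversized result (an assumption, since the workspace is not shown) is that order(x) picked up an unrelated, longer x from the workspace rather than the column df$x. The comma matters because df[i, ] selects rows, whereas df[i] selects columns.

```r
set.seed(42)   # arbitrary seed, only to make the example repeatable
df <- data.frame(x = runif(10), y = runif(10))

# Refer to the column explicitly and keep the comma: rows are reordered,
# all columns are kept
df2 <- df[order(df$x), ]
print(dim(df2))

# Without the comma the ten indices are treated as column numbers, which
# fails for a two-column data frame; the error is captured here
no_comma <- tryCatch(df[order(df$x)],
                     error = function(e) conditionMessage(e))
print(no_comma)
```

with(df, df[order(x), ]) is another way to make sure x is looked up inside the data frame.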


Thanks
Bob



Re: [R] R-help mailing list activity / R-not-help?

2016-01-24 Thread Robert Sherry
I think this mailing list is wonderful and it has helped me a lot. In 
fact, I am not sure I would be using R today if it was not for this list.

Bob

On 1/24/2016 4:42 PM, Michael Friendly wrote:


On 1/23/2016 7:28 AM, Jean-Luc Dupouey wrote:

Dear members,

Not a technical question:

But one worth raising...


The number of threads in this mailing list, following a long period of
increase, has been regularly and strongly decreasing since 2010, passing
from more than 40K threads to less than 11K threads last year. The trend
is similar for most of the "ancient" mailing lists of the R-project.

[snip ...]


I hope it is the right place to ask this question. Thanks in advance,



In addition to the other replies, there is another trend I've seen that
has actively worked to suppress discussion on R-help and move it 
elsewhere. The general things:
- R-help was too unwieldy and so it was a good idea to hive-off 
specialized topics to various sub lists, R-SIG-Mac, R-SIG-Geo, etc.
- Many people posted badly-formed questions to R-help, and so it
was a good idea to develop and refer to the posting guide to mitigate
the number of purely junk postings.


Yet, the trend I've seen is one of increasing **R-not-help**, in that 
there are many posts, often by new R users who get replies that not
infrequently range from just mildly off-putting to actively hostile:

- Is this homework? We don't do homework (sometimes false alarms,
where the OP has to reply to say it is not)
- Didn't you bother to do your homework, RTFM, or Google?
- This is off-topic because XXX (e.g., it is not strictly an R 
programming question).
- You asked about doing XXX, but this is a stupid thing
to want to do.
- Don't ask here; you need to talk to a statistical consultant.

I find this sad in a public mailing list sent to all R-help subscribers
and I sometimes cringe
when I read replies to people who were actually trying to get
help with some R-related problem, but expressed it badly, didn't
know exactly what to ask for, or how to format it,
or somehow motivated a frequent-replier to publicly dis the OP.

On the other hand, I still see a spirit of great generosity among some
people who frequently reply to R-help, taking a possibly badly posed
or ill-formatted question, and going to some lengths to provide a
helpful answer of some sort.  I applaud those who take the time
and effort to do this.

I use R in a number of my courses, and used to advise students to
post to R-help for general programming questions (not just homework) 
they couldn't solve. I don't do this any more, because several of them
reported a negative experience.

In contrast, in the Stackexchange model, there are numerous sublists
cross-classified by their tags.  If I have a specific knitr, ggplot2, 
LaTeX, or statistical modeling question, I'm now more likely to post 
it there, and the worst that can happen is that no one "upvotes" it
or someone (helpfully) marks it as a duplicate of a similar question.
But comments there are not propagated to all subscribers,
and those who reply helpfully, can see their solutions accepted or not,
or commented on in that specific topic.

Perhaps one solution would be to create a new "R-not-help" list where,
as in a Monty Python skit, people could be directed there to be 
insulted and all these unhelpful replies could be sent.


A milder alternative is to encourage some R-help subscribers to click 
the "Don't send" or "Save" button and think better of their replies.





__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sorting a Data Frame

2016-01-26 Thread Robert Sherry


Thank  you for the response. As expected, the following expression worked:
df[order(df$x),]
I would expect the following expression to work also:
df[order(df$x)]
However it does not. That is, the comma is needed. Please tell me why 
the comma is there.
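
On the comma: a data frame is stored as a list of columns, so a single index with no comma selects columns, while the two-index form [rows, columns] with an empty second slot selects rows and keeps every column. A small sketch of the difference follows; the data frame dfr and its values are made up for illustration only.

```r
# A made-up 3-row data frame, used only for illustration
dfr <- data.frame(x = c(0.7, 0.1, 0.4), y = c("a", "b", "c"))

ord <- order(dfr$x)          # row positions that sort x in increasing order

# Two-index form: rows first, columns second.
# The empty column slot after the comma keeps all columns.
sorted <- dfr[ord, ]

# One-index form treats the data frame as a list of columns, so the values
# in ord are read as column numbers. dfr has only 2 columns, so asking for
# column 3 is an error; tryCatch() captures the message instead of stopping.
one_index <- tryCatch(dfr[ord], error = function(e) conditionMessage(e))

print(sorted)
print(one_index)
```

With a data frame that happened to have at least as many columns as rows, the one-index form would not fail; it would quietly reorder the columns instead, which is arguably worse.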


Thanks
Bob
On 1/26/2016 8:19 AM, S Ellison wrote:

On 23.01.2016 01:21, Robert Sherry wrote:

In R, I run the following commands:
  df = data.frame( x=runif(10), y=runif(10) )
  df2 = df[order(x),]

You use another x from your workspace, you actually want to


   df2 = df[order(df[,"x"]),]

or
df[order(df$x),]

And just to prevent yet more confusion, you might also want to avoid 'df' as a 
name. 'df' is the function that returns the density of the F distribution ...
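
To make the "another x from your workspace" point concrete, here is a minimal sketch. The names dat, by_workspace_x, by_column_x and by_with are invented for the example, and the seed is arbitrary.

```r
set.seed(1)                        # arbitrary seed, for a repeatable example
dat <- data.frame(x = runif(10), y = runif(10))

x <- 1:10                          # an unrelated x left over in the workspace

# order(x) finds the workspace x, not the column, so the rows stay as they were
by_workspace_x <- dat[order(x), ]

# Naming the column explicitly sorts by the data frame's own x
by_column_x <- dat[order(dat$x), ]

# with() evaluates order(x) inside dat, so the column is found first
by_with <- dat[with(dat, order(x)), ]

print(rownames(by_workspace_x))    # row labels show the original order
print(rownames(by_column_x))       # row labels show the sorted order
```

If no x existed in the workspace at all, order(x) would stop with an "object not found" error, which is easier to notice than the silent case above.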

S Ellison




__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Variable Argument Function

2016-02-07 Thread Robert Sherry


I would like to write a function in R that would take a variable number 
of integers as parameters. I do not have a pressing reason to do this, I 
am just trying to learn R. I thought a good first step would be to print 
out the arguments. So I wrote the following function:


f1 = function (...)
{
    list1 = as.list(...)
    for( i in 1:length(list1) )
        cat( "i is ", list1[[i]], "\n" )
    return (0)
}

I ran it as:
f1(2,4,10,12)
and I get:
i is  2
[1] 0
I was hoping for
i is  2
i is  4
i is  10
i is  12

I am hoping somebody can tell me what I am doing wrong. Is using a list 
a bad idea?


Thanks
Bob

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variable Argument Function

2016-02-07 Thread Robert Sherry

Ben,

Your solution solved my issue. Thank you. I do not see a need for a 
nested function. Based upon your solution, I came up with this solution:

fbob = function (...)
{
    l1 = list(...)
    for( i in 1:length(l1) )
        cat( "i is ", l1[[i]], "\n" )
    return (0)
}

It does not use nested functions and it works also. Is there a reason 
why your solution is better?


Bob

On 2/7/2016 7:14 PM, Ben Tupper wrote:

Hi,


On Feb 7, 2016, at 6:24 PM, Duncan Murdoch  wrote:

On 07/02/2016 6:12 PM, Robert Sherry wrote:

I would like to write a function in R that would take a variable number
of integers as parameters. I do not have a pressing reason to do this, I
am just trying to learn R. I thought a good first step would be to print
out the arguments. So I wrote the following function:

f1 = function (...)
{
  list1 = as.list(...)

This is wrong.  The ... object is weird; it's not something that can be coerced 
to a list.  However, you can pass it as list(...) and it will give you what you 
were expecting.


Do you mean that Bob should nest a function within f1?  Like this?

f1 = function (...){
f2 <- function(list1){
   for( i in 1:length(list1) ) cat( "i is ", list1[[i]], "\n" )
   return (0)
 }
 f2(list(...))
}

f1(2,4,10,12)


f1(2,4,10,12)

i is  2
i is  4
i is  10
i is  12

Ben



The theory is that it will expand to multiple arguments to the list() function, 
which constructs a list containing them.  as.list() doesn't want a bunch of 
arguments, it will just ignore most of them.
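
The difference described here can be seen side by side in a short sketch. The function names show_args and first_only are made up for this illustration; seq_along() is used instead of 1:length() only so that a call with no arguments does not loop over 1:0.

```r
show_args <- function(...) {
  args <- list(...)                # one list element per argument passed
  for (i in seq_along(args))       # safe even when no arguments are given
    cat("i is ", args[[i]], "\n")
  invisible(length(args))          # return the argument count without printing it
}

# as.list() converts its first argument; the remaining ones are ignored
first_only <- function(...) as.list(...)

n <- show_args(2, 4, 10, 12)
str(first_only(2, 4, 10, 12))
```

A nested helper function is not needed for this; list(...) on its own already captures every argument.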

Duncan Murdoch


  for( i in 1:length(list1) )
  cat( "i is ", list1[[i]], "\n" )
  return (0)
}

I ran it as:
  f1(2,4,10,12)
and I get:
  i is  2
  [1] 0
I was hoping for
  i is  2
  i is  4
  i is  10
  i is  12

I am hoping somebody can tell me what I am doing wrong. Is using a list
a bad idea?

Thanks
Bob



Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Double AND within an IF statement

2016-02-24 Thread Robert Sherry

This should work:
if ( age > 4 && age <= 8 && infection > 0 ) replacement = 2
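
The if() form handles one animal at a time. If age and infection are whole columns, a vectorized version may be closer to what is needed. This is only a sketch: the data frame animals and its values are made up, the upper bound follows the "less or equal to 8" wording of the question, and "infection is positive" is assumed to mean infection > 0.

```r
# Made-up example data; the column names follow the question
animals <- data.frame(age = c(3, 5, 8, 9, 6), infection = c(1, 1, 1, 1, 0))

# & compares element by element; && is meant for a single TRUE/FALSE inside if()
hit <- animals$age > 4 & animals$age <= 8 & animals$infection > 0

# 2 where the rule applies, NA everywhere else
animals$replacement <- ifelse(hit, 2, NA)
print(animals)
```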

Bob

On 2/24/2016 7:08 AM, Polychronis KOSTOULAS wrote:


Hi there,

apologies if this is easy. I want to write this condition:

If age is more than 4 years and less or equal to 8 years and infection 
is positive then replacement is 2.


Can you help me with the double AND?

Thanks,
Polychronis

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Regression with factor having1 level

2016-03-10 Thread Robert McGehee
Hello R-helpers,
I'd like a function that given an arbitrary formula and a data frame
returns the residual of the dependent variable, and maintains all NA values.

Here's an example that will give me what I want if my formula is y~x1+x2+x3
and my data frame is df:

resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))

Here's the catch: I do not want my function to ever fail due to a factor
with only one level. A one-level factor may appear because 1) the user
passed it in, or 2) (more common) only one level of a factor is left after
na.exclude removes the rows containing NA values.

Here is the error I would get above if one of the terms was a factor with
one level:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

Instead of giving me an error, I'd like the function to do just what lm()
normally does when it sees a variable with no variance, ignore the variable
(coefficient is NA) and continue to regress out all the other variables.
Thus if 'x2' is a factor with one level in the above example, I'd like
the function to return the result of:
resid(lm(y~x1+x3, data=df, na.action=na.exclude))

Can anyone provide me a straightforward recommendation for how to do this?
I feel like it should be easy, but I'm honestly stuck, and my Google
searching for this hasn't gotten anywhere. The key is that I'd like the
solution to be generic enough to work with an arbitrary linear formula, and
not substantially kludgy (like trying every combination of regression terms
until one works) as I'll be running this a lot on big data sets and don't
want my computation time swamped by running unnecessary regressions or
checking for number of factors after removing NAs.

Thanks in advance!
--Robert


PS. The Google search feature in the R-help archives appears to be down:
http://tolstoy.newcastle.edu.au/R/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression with factor having1 level

2016-03-10 Thread Robert McGehee
Here's an example for clarity:

> df <- data.frame(y=c(0,2,4,6,8), x1=c(1,1,2,2,NA),
x2=factor(c("A","A","A","A","B")))
> resid(lm(y~x1+x2, data=df, na.action=na.exclude))
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

Note that the x2 factor variable contains two levels, but the "B" level is
excluded in the regression due to the NA value in x1. Hence the error.

Instead of the above error, I would like a function that returns the
residual of the regression without the offending term, which in this case
would be equivalent to:
> resid(lm(y~x1, data=df, na.action=na.exclude))
 1  2  3  4  5
-1  1 -1  1 NA

Note the 5th term returns an NA as there is an NA in the x1 independent
variable, which was what I had meant by maintain NAs.

I'm currently leaning towards rewriting model.matrix.default so that it
removes offending terms rather than give an error, but if someone has done
this already (or something more elegant), that would of course be preferred
:)
--Robert

On Thu, Mar 10, 2016 at 7:39 PM, David Winsemius wrote:

>
> > On Mar 10, 2016, at 2:00 PM, Robert McGehee  wrote:
> >
> > Hello R-helpers,
> > I'd like a function that given an arbitrary formula and a data frame
> > returns the residual of the dependent variable,and maintains all NA
> values.
>
> What does "maintains all NA values" actually mean?
> >
> > Here's an example that will give me what I want if my formula is
> y~x1+x2+x3
> > and my data frame is df:
> >
> > resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))
> >
> > Here's the catch, I do not want my function to ever fail due to a factor
> > with only one level. A one-level factor may appear because 1) the user
> > passed it in, or 2) (more common) only one factor in a term is left after
> > na.exclude removes the other NA values.
> >
> > Here is the error I would get
>
> From what code?
>
>
> > above if one of the terms was a factor with
> > one level:
> > Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
> >  contrasts can be applied only to factors with 2 or more levels
>
> Unable to create that error with the actions you describe but do not
> actually offer in coded form:
>
>
> > dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=TRUE, x3=rnorm(10))
> > lm(y~x1+x2+x3, dfrm)
>
> Call:
> lm(formula = y ~ x1 + x2 + x3, data = dfrm)
>
> Coefficients:
> (Intercept)           x1       x2TRUE           x3
>    -0.16274     -0.30032           NA     -0.09093
>
> > resid(lm(y~x1+x2+x3, data=dfrm, na.action=na.exclude))
>           1           2           3           4           5           6
> -0.16097245  0.65408508 -0.70098223 -0.15360434  1.26027872  0.55752239
>           7           8           9          10
> -0.05965653 -2.17480605  1.42917190 -0.65103650
>
> >
>
>
> > Instead of giving me an error, I'd like the function to do just what lm()
> > normally does when it sees a variable with no variance, ignore the
> variable
> > (coefficient is NA) and continue to regress out all the other variables.
> > Thus if 'x2' is a factor with one variable in the above example, I'd like
> > the function to return the result of:
> > resid(lm(y~x1+x3, data=df, na.action=na.exclude))
> > Can anyone provide me a straight forward recommendation for how to do
> this?
> > I feel like it should be easy, but I'm honestly stuck, and my Google
> > searching for this hasn't gotten anywhere. The key is that I'd like the
> > solution to be generic enough to work with an arbitrary linear formula,
> and
> > not substantially kludgy (like trying ever combination of regressions
> terms
> > until one works) as I'll be running this a lot on big data sets and don't
> > want my computation time swamped by running unnecessary regressions or
> > checking for number of factors after removing NAs.
> >
> > Thanks in advance!
> > --Robert
> >
> >
> > PS. The Google search feature in the R-help archives appears to be down:
> > http://tolstoy.newcastle.edu.au/R/
>
> It's working for me.
>
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression with factor having1 level

2016-03-11 Thread Robert McGehee
Hi,
In case this is helpful for anyone, I think I've coded a satisfactory
function answering my problem (of handling formulas containing 1-level
factors) by hacking liberally at the model.matrix code to remove any
model terms for which the contrast fails. As it's a problem I've come
across a lot (since my data frames have factors and lots of missing
values), adding support for 1-level factors might be a nice item for
the R Wishlist. I suppose a key question is, does anyone ever _want_
to see the error "contrasts can be applied only to factors with 2 or
more levels", or should the contrasts function just add a column of
all zeros (or ones) to the design matrix and let the modelling
functions handle that the same way it does any other zero-variance
term?

Anyway, my function below:

lmresid <- function(formula, data) {
    mf <- model.frame(formula, data=data, na.action=na.exclude)
    omit <- attr(mf, "na.action")
    t <- terms(mf)
    contr.funs <- as.character(getOption("contrasts"))
    namD <- names(mf)
    for (i in namD) if (is.character(mf[[i]]))
        mf[[i]] <- factor(mf[[i]])
    isF <- vapply(mf, function(x) is.factor(x) || is.logical(x), NA)
    isF[1] <- FALSE
    isOF <- vapply(mf, is.ordered, NA)
    for (nn in namD[isF])
        if (is.null(attr(mf[[nn]], "contrasts"))) {
            noCntr <- try(contrasts(mf[[nn]]) <- contr.funs[1 + isOF[nn]],
                          silent=TRUE)
            if (inherits(noCntr, "try-error")) {
                # Remove term from model on error
                mf[[nn]] <- NULL
                t <- terms(update(t, as.formula(paste("~ . -", nn))), data=mf)
            }
        }
    ans <- .External2(stats:::C_modelmatrix, t, mf)
    r   <- .lm.fit(ans, mf[[1]])$residual
    stats:::naresid.exclude(omit, r)
}

## Note that lmresid now returns the same values as resid with the
## 1-level factor removed.
df <- data.frame(y=c(0,2,4,6,8), x1=c(1,1,2,2,NA),
x2=factor(c("A","A","A","A","B")))
lmresid(y~x1+x2, data=df)
resid(lm(y~x1, data=df, na.action=na.exclude))
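
For readers who would rather not call unexported internals, the same idea can be sketched with exported functions only. This is a simplified variant, not a replacement for the function above: the name lmresid_simple is made up, it only handles terms that are plain variable names (no interactions or transformed factors), it builds the model frame twice, and dropping a term can change which rows count as complete.

```r
lmresid_simple <- function(formula, data) {
  # Complete cases only, so we see what lm() would actually be given
  mf <- model.frame(formula, data = data, na.action = na.omit)
  preds <- names(mf)[-1]                       # first column is the response

  # Flag factor-like predictors that have fewer than 2 distinct values left
  one_value <- vapply(mf[preds], function(v)
    (is.factor(v) || is.character(v) || is.logical(v)) &&
      length(unique(v)) < 2, logical(1))
  bad <- preds[one_value]

  if (length(bad) > 0) {
    # Build an update such as ". ~ . - x2" to drop the offending terms
    drop <- as.formula(paste(". ~ . -", paste(bad, collapse = " - ")))
    formula <- update(formula, drop)
  }
  # na.exclude pads the residuals with NA for the rows that were left out
  resid(lm(formula, data = data, na.action = na.exclude))
}

# Same toy data as in the post, under a different name
toy <- data.frame(y = c(0, 2, 4, 6, 8), x1 = c(1, 1, 2, 2, NA),
                  x2 = factor(c("A", "A", "A", "A", "B")))
r_simple <- lmresid_simple(y ~ x1 + x2, data = toy)
print(r_simple)
```

The extra model.frame() pass is the price for staying on the public API; on large data sets the internal version above avoids that second pass.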

--Robert

PS, Peter, wasn't sure if you also meant to add comments, but they
didn't come through.


On Fri, Mar 11, 2016 at 3:40 AM, peter dalgaard  wrote:
>
>> On 11 Mar 2016, at 02:03 , Robert McGehee  wrote:
>>
>>> df <- data.frame(y=c(0,2,4,6,8), x1=c(1,1,2,2,NA),
>> x2=factor(c("A","A","A","A","B")))
>>> resid(lm(y~x1+x2, data=df, na.action=na.exclude))
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Open source project that needs performance optimizations

2016-03-27 Thread Robert Sherry
I am not up on the internals of R but there does seem to be some room for 
parallelism. Are we talking about special hardware, or running this on 
an Intel box? If it is the second, then I am thinking threads would be 
the way to go. Please consider the following R statement:
for( i in 1:30 ) a[i] = f1(i)
Would it make sense to set up a separate thread for each call to f1? I 
think in most cases the answer is no, but on some machines, and 
depending on the running time of f1, it could be a big win. Also, does 
the user have to change his code, or would R be smart enough to do the 
work behind the scenes? I consider the second to be significantly better 
than the first.
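
For comparison, this is roughly how the loop looks today with the parallel package that ships with R. Two caveats: the package uses separate worker processes rather than threads, and the user does have to rewrite the loop by hand. The body of f1 below is a made-up stand-in, and the choice of 2 workers is arbitrary.

```r
library(parallel)                 # part of the standard R distribution

f1 <- function(i) i^2             # stand-in for an expensive function

# Sequential version of the loop
a <- numeric(30)
for (i in 1:30) a[i] <- f1(i)

# Parallel version: two worker processes, each given part of 1:30
cl <- makeCluster(2)
a_par <- unlist(parLapply(cl, 1:30, f1))
stopCluster(cl)

print(all.equal(a, a_par))
```

Starting the workers and shipping data to them has a cost, so for a cheap f1 like this one the parallel version is expected to be slower; the approach pays off only when each call does substantial work.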


You may also want to look at the following URL
http://stackoverflow.com/questions/1395309/how-to-make-r-use-all-processors

Bob

On 3/27/2016 11:52 AM, PSATHAS NILOS-HRISTOS wrote:

Hello,
I am an undergraduate student in computer engineering and I'm 
considering doing my thesis on an open source project, making 
performance optimizations and/or adding parallelism to it where possible 
(or, even better, making use of the GPU). Do you think that the R project 
is a good candidate?


Thanks,
Psathas Neilos


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to speed up my program

2016-04-01 Thread Robert Sherry

Hi Ragia,

First, when you wrote mad, I assume you mean made. Also, when you say 
it is a multi core prog, does that mean it is using threads, or running two 
or more items in parallel? By any chance are you using this package?
https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf
If not, maybe you should. There is also a version of R called pqR. 
It is multithreaded and that may be exactly what you need.
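
As a rough idea of what a simulation looks like with the parallel package, here is a sketch. Everything in it is assumed for illustration: one_run is a stand-in for a single simulation run, and the run count, seed and number of workers are arbitrary.

```r
library(parallel)

# Stand-in for one simulation run; the argument is just the run number
one_run <- function(run_id) mean(rnorm(1000))

n_runs  <- 200
n_cores <- 2                      # on a larger machine, e.g. detectCores() - 1

run_simulation <- function(seed) {
  cl <- makeCluster(n_cores)
  on.exit(stopCluster(cl))        # always shut the workers down
  clusterSetRNGStream(cl, seed)   # separate, repeatable random streams per worker
  parSapply(cl, seq_len(n_runs), one_run)
}

res1 <- run_simulation(123)
res2 <- run_simulation(123)
print(length(res1))
print(identical(res1, res2))
```

Collecting the results in one call, as parSapply() does here, also avoids growing a result object inside a loop, which is a common reason for a program to slow down as the number of runs goes up.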


Also R is interpreted not compiled. Therefore if speed is important 
rewriting it in a compiled language like C or C++ could be a whole lot 
faster. I suspect that this would also be a lot of work and probably not 
worth it.


Bob

On 4/1/2016 4:01 AM, Ragia . wrote:


Dear group
I had an R program that was too slow, so I mad it a multi core prog to speed 
it up. It is a simulation: when the runs are 100 it is very fast, but raising 
the runs to 10k mad it fast at first and then it slowed down.
I checked the HW usage and here are the top command results:


%Cpu0  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  65863948 total, 13940104 used, 51923844 free,   231084 buffers
KiB Swap:  1046520 total,        0 used,  1046520 free.  4418180 cached Mem

what should I do to speed it up?
thanks in advance   
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert matrix

2016-10-30 Thread Robert Baer



On 10/29/2016 11:19 AM, Elham - via R-help wrote:

Dear Madam / Sir,I saw this function for "Convert to matrix as it is that you wanted" 
> test2<-as.matrix(test1)

colnames(test2)<-NULL
genelist<-c("Fkh2","Swi5","Sic1")
rownames(test2)<-genelist
test2
#  [,1]  [,2]  [,3]
#Fkh2 0.141 0.242 0.342
#Swi5 0.224 0.342 0.334
#Sic1 0.652 0.682 0.182


What is the function for large data? My data and genelist have 28031 rows; how can I convert? Clearly I cannot 
write 28031 genes like genelist<-c("Fkh2","Swi5","Sic1")

You can assign the names of your genes by any method convenient. The 
point is not necessarily to use the c() function.   If you have them in 
a .csv file somewhere, simply read them in to create the genelist vector.


If you do not know how to do this you should probably read "An 
Introduction to R", 
https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf


?read.csv typed at the command prompt will give you some specifics

In the end, you will do something like:
genelist  <- read.csv("AfileOfMyGenes.csv", header = TRUE)

# assume your gene names are in the first column which is titled genename
genelist <- genelist$genename
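
Put together, the whole round trip might look like the sketch below. The file is created on the fly so the example runs on its own; the column title genename and the three genes come from the lines above, while the column names of test1 are invented, and it is assumed that the names in the file are in the same order as the rows of the data.

```r
# Create a small stand-in for the gene file so the example is self-contained
gene_file <- tempfile(fileext = ".csv")
writeLines(c("genename", "Fkh2", "Swi5", "Sic1"), gene_file)

# Stand-in for the numeric data, using the values shown in the question
test1 <- data.frame(a = c(0.141, 0.224, 0.652),
                    b = c(0.242, 0.342, 0.682),
                    c = c(0.342, 0.334, 0.182))

test2 <- as.matrix(test1)
colnames(test2) <- NULL

# Read the names as plain character strings rather than a factor
genes <- read.csv(gene_file, header = TRUE, stringsAsFactors = FALSE)
stopifnot(nrow(genes) == nrow(test2))   # one name per row of the matrix

rownames(test2) <- genes$genename
print(test2)
```

The same lines work unchanged for 28031 rows; only the file and the data get bigger.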





Your attention would be really appreciated. Best Regards, Elham Dalalbshi Esfahani


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Run a Python code from R

2016-11-19 Thread Robert Baer

On 11/16/2016 7:10 PM, David Winsemius wrote:
>> On Nov 16, 2016, at 4:53 PM, Nelly Reduan  wrote:
>>
>> Thank you very much for your help !
>>
>>
>> I 'm trying to use the package "rPithon" but I obtain this error message:
> Are you sure you are not just misspelling rPython? If that's not the issue 
> then you need to say where you got rPithon,

 From https://www.r-bloggers.com/rpithon-vs-rpython/

"Similar to rPython, the rPithon package 
(http://rpithon.r-forge.r-project.org) allows users to execute Python 
code from R and exchange the data between Python and R. However, the 
underlying mechanisms between these two packages are fundamentally 
different. While rPithon communicates with Python from R through pipes, 
rPython accomplishes the same task with json. A major advantage of 
rPithon over rPython is that multiple Python processes can be started 
within a R session. However, rPithon is not very robust while exchanging 
large data objects between R and Python."
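
If neither package is an option, the system2() route from the question quoted below can also be made to work by writing the Python code to a script file and handing that file to the interpreter. This is only a sketch: the helper name run_python is made up, and it assumes an interpreter named python3 or python can be found on the PATH.

```r
run_python <- function(code) {
  # Look for an interpreter on the PATH; which name exists varies by system
  found <- Sys.which(c("python3", "python"))
  found <- found[nzchar(found)]
  if (length(found) == 0) return(NULL)       # no Python available

  script <- tempfile(fileext = ".py")
  on.exit(unlink(script))                    # remove the temporary script afterwards
  writeLines(code, script)

  # stdout = TRUE returns what the script printed, one element per line
  system2(found[[1]], args = shQuote(script), stdout = TRUE)
}

out <- run_python("print(6 * 7)")
print(out)
```

The three nlmpy lines from the question could be passed the same way, as a character vector with one element per line, provided nlmpy is installed for the interpreter that is found. The script runs in R's current working directory, so that is where raster.asc would be written.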


>> On Wed, Nov 16, 2016 at 4:53 PM, Nelly Reduan  wrote:
>>> Hello,
>>>
>>>
>>> How can I run this Python code from R ?
>>>
>>>
>> import nlmpy
>> nlm = nlmpy.mpd(nRow=50, nCol=50, h=0.75)
>> nlmpy.exportASCIIGrid("raster.asc", nlm)
>>>
>>> Nlmpy is a Python package to build neutral landscape models
>>>
>>> https://pypi.python.org/pypi/nlmpy . The example comes from this website. I 
>>> tried to use the function system2 but I don't know how to use it.
>>>
>>>
>>> path_script_python <- "C:/Users/Anaconda2/Lib/site-packages/nlmpy/nlmpy.py"
>>>
>>> test <- system2("python", args = c(path_script_python, as.character(nRow), 
>>> as.character(nCol), as.character(h)))
>>>
>>> Thanks a lot for your help.
>>> Nell
>>>
>>>
>>> nlmpy 0.1.3 : Python Package Index
>>> pypi.python.org
>>> NLMpy. NLMpy is a Python package for the creation of neutral landscape 
>>> models that are widely used in the modelling of ecological patterns and 
>>> processes across ...
>>>
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>>
>>> David Winsemius
>>> Alameda, CA, USA
>>>
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] stacked and dodged bar graph ggplot

2016-12-30 Thread Robert Lynch
I have some census data with race and ethnicity for various towns.  I am
trying to make a stacked bar graph where all the race data is in one
stacked bar, and all the ethnicity data is in another.

Below is a minimal reproducible sample.

library("ggplot2")

Demog <- data.frame(
  source = c(rep("Davis",4), rep("Dixon",4), rep("Winters",4)),
  group  = c("Asian / Pacific Islander", "Caucasian", "Latinx", "Not Latinx",
             "African American", "Native American", "Latinx", "Not Latinx",
             "Mixed race", "Other", "Latinx", "Not Latinx"),
  number = c(14491, 42571, 8172, 57450, 562, 184, 7426, 10952,
             332, 1488, 3469, 3155),
  field  = rep(c(rep("race",2), rep("ethnicity",2)), 3))

# Levels must match the spelling used in 'group', otherwise they become NA
Demog$race <- factor(Demog$group, levels=c("Asian / Pacific Islander",
  "Caucasian", "African American", "Native American", "Mixed race", "Other"))
Demog$ethn <- factor(Demog$group, levels=c("Latinx", "Not Latinx"))
Demog$location <- factor(Demog$source, levels=c("Dixon", "Winters", "Davis"))

Demog.bar1 <- ggplot(data = Demog, aes(x = location, y = number, fill = race)) +
  theme_bw() + geom_bar(stat = "identity", position = "stack") + coord_flip()

Demog.bar2 <- ggplot(data = Demog, aes(x = location, y = number, fill = ethn)) +
  theme_bw() + geom_bar(stat = "identity", position = "stack") + coord_flip()

show(Demog.bar1)
show(Demog.bar2)
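
One possible way to get both stacks into a single figure, offered only as a sketch, is to put field on the axis, fill by group, and facet by town. The data frame toy below repeats the Davis and Dixon rows from the sample above so that the block runs on its own; the layout choices (one facet per town, flipped bars) are assumptions about what is wanted.

```r
library(ggplot2)

# The Davis and Dixon rows from the sample data, one row per town and group
toy <- data.frame(
  location = rep(c("Davis", "Dixon"), each = 4),
  field    = rep(c("race", "race", "ethnicity", "ethnicity"), 2),
  group    = c("Asian / Pacific Islander", "Caucasian", "Latinx", "Not Latinx",
               "African American", "Native American", "Latinx", "Not Latinx"),
  number   = c(14491, 42571, 8172, 57450, 562, 184, 7426, 10952))

# x = field gives one bar for race and one for ethnicity;
# fill = group stacks the groups inside each bar;
# facet_wrap() repeats that pair of bars for every town
p <- ggplot(toy, aes(x = field, y = number, fill = group)) +
  geom_bar(stat = "identity", position = "stack") +
  facet_wrap(~ location, ncol = 1) +
  coord_flip() +
  theme_bw()

# print(p) or show(p) draws the figure
```

Because the race and ethnicity bars sit side by side within each town, this gives the "dodged" look without needing position = "dodge", which ggplot2 cannot combine with stacking in a single geom.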




Much thanks,
Robert

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] for loop in R

2017-01-12 Thread Robert Sherry
I only see one for loop in your code. I am wondering if you want a 
second for loop based upon the length of newdata.


I would also think that you do not need the second call to set.seed.
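
In case it helps, a two-level loop of the kind I have in mind could be organized as below. This is a generic sketch, not JM-specific code: the helper predict_by_subject and the toy data are made up, and predict_fun stands in for whatever prediction call is needed.

```r
# Hypothetical helper: data is a long-format data frame, id_col names the
# subject column, and predict_fun stands in for the real prediction call
predict_by_subject <- function(data, id_col, predict_fun) {
  by_subject <- split(data, data[[id_col]])      # one data frame per subject
  out <- vector("list", length(by_subject))
  names(out) <- names(by_subject)
  for (s in seq_along(by_subject)) {             # outer loop: subjects
    subj <- by_subject[[s]]
    preds <- vector("list", nrow(subj))
    for (i in seq_len(nrow(subj))) {             # inner loop: first i rows of this subject
      preds[[i]] <- predict_fun(subj[seq_len(i), , drop = FALSE])
    }
    out[[s]] <- preds
  }
  out
}

set.seed(123)                                    # set once, before the loops
toy <- data.frame(ID = c(1, 1, 1, 2, 2), value = rnorm(5))
res <- predict_by_subject(toy, "ID", nrow)       # nrow is a trivial stand-in
str(res)
```

With the code quoted below, predict_fun would presumably be a small wrapper such as function(d) survfitJM(fitJOINT.NULL, newdata = d, idVar = "USUBJID"), and each subject can have a different number of rows without any extra bookkeeping.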

Bob

On 1/12/2017 4:44 PM, Jennifer Sheng wrote:

Dear friends,  I am working on a double loop using for.  One level of loop
is to predict N times for each subject, and the second level is to predict
M times for the every subject, one subject after one subject.   Please note
every subject have different N or M rows of data.   Any advice?  Thank you
so much!

Below is the current code:

set.seed (123)   ## for consistent result;

ND <- S004Cmin[S004Cmin$ID %in% c(1:10),]   # define the first 10 subjects

predSurv <- vector("list", nrow(ND))

for (i in 1:nrow(ND)) {

   set.seed(123)

   predSurv[[i]] <- survfitJM(fitJOINT.NULL, newdata = ND[1:i, ],
idVar="USUBJID")

   }



Thank you very much!

Jenny

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Test

2017-01-16 Thread Robert Piliero
-- 

Robert J. Piliero

Cell: (617) 283 1020
38 Linnaean St. #6
Cambridge, MA, 02138
USA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Receiving NaN message

2017-01-16 Thread Robert Piliero
Hello,

I am working on a Coursera assignment and have combined 332 files into a
single data frame called "dat". The dataframe has 4 columns,

1. Date
2. Sulfate (numerical values)
3. Nitrate  (numerical  values)
4. ID # (numerical values).

Our assignment is to write a function pollutantmean <- function(directory,
pollutant, ID), whereby we can calculate the mean by inputting the
pollutant name and ID #.

I have reached the stage of subsetting the data, e.g. by ID # 1-10; however,
when I do so and then calculate the mean of this subset I receive the NaN
message (even though I have instructed R to disregard the "NA"s).


*Beginning Code: *
getwd()
read.csv(specdata)
specdata <- ("C:/Users/rober/specdata")
list.files(specdata)
files_full <- list.files(specdata, full.names=TRUE)
files_full
dat <- data.frame()
for (i in 1:332){
  dat <- rbind(dat,read.csv(files_full[i]))
}
str(dat)
mean(dat$sulfate, na.rm=TRUE)

*Code which generated the NaN message. *
dat1_10 <- dat[which(dat[,ID] ==1:10),]
mean(dat1_10$sulfate, na.rm=TRUE)

Am I making a mistake in subsetting the rows with ID's 1:10? Any advice
would be appreciated.

Thank you,

Robert

Robert J. Piliero

Cell: (617) 283 1020
38 Linnaean St. #6
Cambridge, MA, 02138
USA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
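
A possible explanation for the NaN in the post above, offered as a sketch rather than a confirmed diagnosis: `==` against `1:10` recycles the ten values down the ID column instead of testing membership, so most of the wanted rows are dropped, and if every sulfate value that remains is NA then mean(..., na.rm=TRUE) has nothing to average and returns NaN. The toy data frame below is made up; only the column names ID and sulfate follow the post. Note also that dat[,ID] needs the column name quoted, as in dat[,"ID"].

```r
# Made-up stand-in for the combined data: 12 monitors, 5 rows each
dat <- data.frame(
  ID      = rep(1:12, each = 5),
  sulfate = rep(c(NA, 1, 2, NA, 3), times = 12)
)

# `==` recycles 1:10 along the column, so only rows whose ID happens to
# line up with the recycled value are kept
recycled <- dat[which(dat$ID == 1:10), ]

# `%in%` tests membership, which is what "IDs 1 to 10" means
wanted <- dat[dat$ID %in% 1:10, ]

nrow(recycled)                     # far fewer rows than nrow(wanted)
nrow(wanted)                       # 50
mean(wanted$sulfate, na.rm = TRUE) # 2

# With nothing but NAs left, na.rm = TRUE leaves an empty vector
mean(c(NA_real_, NA_real_), na.rm = TRUE)  # NaN
```
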


Re: [R] (no subject)

2017-01-30 Thread Robert Sherry
Here is one thought. Assign each month a value of 0, 1 or 2. Then do a 
simple linear regression analysis where the value of the month
is the independent variable. You can also do multiple linear regression 
with the value you assigned to the month plus the other factors that
you believe are causing a change to your data. Time is the one that 
comes to my mind. You can do this with the standard R function lm.


I hope this helps.

Bob

On 1/30/2017 9:11 AM, Kwesi Quagraine wrote:

Hello, I have a data with two variables nodes and index, I want to extract
3 months seasons, with a shift of 1 month, that is, DJF, JFM, FMA etc to
OND. Was wondering how to go about it. Kindly find data sample below, data
is in csv format.
Any help will be appreciated.

My data sample;

   era...1.Node_freq   MEI
1   1980-01-01 -0.389855332  0.3394196488
2   1980-02-01 -0.728019153  0.2483738232
3   1980-03-01 -1.992457784  0.3516954904
4   1980-04-01  0.222760284  0.5736836269
5   1980-05-01  0.972601798  0.6289249144
6   1980-06-01  0.570725954  0.5736836269
7   1980-07-01 -0.977966324  0.4120517119
8   1980-08-01  0.056128836 -0.0104418383
9   1980-09-01  0.987304573 -0.0687520861
10  1980-10-01  1.188242495 -0.1403611624
11  1980-11-01  1.693037763 -0.0963727298
12  1980-12-01  1.173539720 -0.2539126977
13  1981-01-01  0.423698206 -0.6140040528
14  1981-02-01 -2.208098481 -0.5209122536
15  1981-03-01 -0.786830252  0.1133395650
16  1981-04-01 -0.110502611  0.3302127675
17  1981-05-01 -1.272021820 -0.1894645290
18  1981-06-01  0.394292656 -0.3736021538
19  1981-07-01  1.452892441 -0.4032687711
20  1981-08-01  0.698150002 -0.4441882433
21  1981-09-01  0.997106423 -0.1720737534
22  1981-10-01  0.247264908 -0.2436828296
23  1981-11-01  0.771663876 -0.3909929295
24  1981-12-01 -0.316341458 -0.4943145967

Regards,
Kwesi



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
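
For the overlapping three-month seasons asked about in the quoted question (DJF, JFM, FMA, ... OND), one possible base-R approach is a trailing three-month mean with a label built from the month initials. This is only a sketch under assumptions: the series here is a made-up placeholder (1, 2, 3, ...) so the result is easy to check, and the names roll3, season and seasons are illustrative, not from the thread.

```r
# Made-up monthly series; in the post this would be the Node_freq or MEI column
dates <- seq(as.Date("1980-01-01"), by = "month", length.out = 24)
x     <- as.numeric(seq_along(dates))

# Trailing 3-month mean: the value at month t averages months t-2, t-1, t
roll3 <- as.numeric(stats::filter(x, rep(1 / 3, 3), sides = 1))

# Label each window by the initials of its three months (DJF, JFM, ..., OND)
initials <- substring("JFMAMJJASOND", 1:12, 1:12)
m <- as.integer(format(dates, "%m"))
season <- paste0(initials[(m - 3) %% 12 + 1],
                 initials[(m - 2) %% 12 + 1],
                 initials[m])

seasons <- data.frame(end_month = dates, season = season, mean3 = roll3)
head(seasons, 4)   # the first two means are NA: no full window yet
```

Each row is labelled by the month that ends the window, so the February row carries the DJF mean. The first two rows have no complete window and are left as NA rather than filled in.
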

Re: [R] lines those not started with "rs"

2017-01-30 Thread Robert Sherry

Greg,

I am assuming that your data is in a text file. R is a good tool but not
the tool I would use for this job. The tool I would use is grep. The
following command should get you what you want:
 grep -v "^rs" 

Bob

On 1/30/2017 9:23 AM, greg holly wrote:

Hi all;

I have a file which has about 3.000.000 lines. Most of the lines at first
column start with "rs", for example, rs1056, rs1076 and so on. I
would like to get the lines which do not start with "rs" . Your helps
highly appreciated.

Regards,

Greg

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
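
For anyone who prefers to stay inside R, the same filter can be sketched with grepl(), which uses the same "^rs" pattern as the grep command above. The lines below are made up for illustration, and the file name in the comment is hypothetical.

```r
# Made-up lines standing in for the 3,000,000-line file.
# For a real file one could start with:
#   lines <- readLines("snps.txt")   # hypothetical file name
lines <- c("rs1056 0.12 0.50", "chr1_7523 0.33 0.10",
           "rs1076 0.08 0.91", "exm2253 0.44 0.27")

# Keep only the lines that do NOT start with "rs"
keep <- lines[!grepl("^rs", lines)]
keep
```
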





Re: [R] lines those not started with "rs"

2017-01-30 Thread Robert Sherry
then my solution should work.

Bob
On 1/30/2017 9:44 AM, greg holly wrote:
> Hi Robert;
>
> I do appreciate your advice. Only the first column of the data is 
> text. The rest columns are numeric.
>
> Regards,
>
> Greg


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] List raster files

2017-03-05 Thread Robert Baer

On 3/4/2017 7:54 AM, Tomás Pérez C. wrote:

I am working with MODIS raster images from the Aqua and Terra satellites,
and I need to combine the images by day and year (originally given as Julian
day). However, for Terra I have 6031 images and for Aqua 5277. I want
to know how to create an object that selects the images from both folders
that share the same date, and then loop over them afterwards.
Thank you
You will need to provide more information on how and where the date information
is stored and how the files are organized to get any useful response.
If you are using file time stamps, I'm guessing some system info might
be useful.  Please read the posting guide and see what you can do to
help the list help you.







[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
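
Purely as an illustration of the kind of matching asked about, and assuming (this is not stated in the thread) that the date is embedded in each file name as A followed by the year and day of year, the files from two folders could be paired as sketched below. The file names are hypothetical; with real folders the vectors would come from list.files(..., full.names = TRUE).

```r
# Hypothetical file names; the "AYYYYDDD" date field is an assumption
terra_files <- c("MOD13Q1.A2017001.h10v08.tif",
                 "MOD13Q1.A2017017.h10v08.tif",
                 "MOD13Q1.A2017033.h10v08.tif")
aqua_files  <- c("MYD13Q1.A2017017.h10v08.tif",
                 "MYD13Q1.A2017033.h10v08.tif",
                 "MYD13Q1.A2017049.h10v08.tif")

# Pull the 7-digit year + day-of-year key out of each name
get_key <- function(f) sub(".*\\.A([0-9]{7})\\..*", "\\1", basename(f))
terra_key <- get_key(terra_files)
aqua_key  <- get_key(aqua_files)

# Keep only the dates present in both folders, one row per date
common <- intersect(terra_key, aqua_key)
pairs <- data.frame(
  key   = common,
  terra = terra_files[match(common, terra_key)],
  aqua  = aqua_files[match(common, aqua_key)],
  stringsAsFactors = FALSE
)
pairs$date <- as.Date(pairs$key, format = "%Y%j")  # %j is day of year
pairs
```

A loop over seq_len(nrow(pairs)) can then read pairs$terra[i] and pairs$aqua[i] together.
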




Re: [R] Presentation Quality Tables, e.g., Ten rows, Five columns, with nice headers

2017-03-26 Thread Robert Baer
Quite nice Jim.  A little par() magic, some well thought plot window 
dimensions, and good to go.


I wasn't looking, but now that I've seen it, I can imagine uses.

Bruce - see also 
https://cran.r-project.org/web/packages/xtable/vignettes/xtableGallery.pdf


On 3/26/2017 4:28 PM, Jim Lemon wrote:

Hi Bruce,
Well, a start might be:

bdf<-data.frame(Pre=sample(10:20,10),
  During=sample(8:18,10),
  EOT=sample(5:15,10),fu3mo=sample(7:17,10),
  fu6mo=sample(10:20,10))
rownames(bdf)<-paste("S",1:10,sep="")
plot.new()
library(plotrix)
addtable2plot(0.15,0.2,bdf,display.rownames=TRUE,
  bty="o",vlines=TRUE,hlines=TRUE,title="My table")

Jim


On Mon, Mar 27, 2017 at 4:16 AM, MyCalendar  wrote:

Hi R'ers:
After browsing for a good package for quality table construction, I found 
nothing.
Any advice?
Thanks
Bruce

---
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Dividing by 0

2008-07-24 Thread Robert Baer


I'm trying to calculate the percent change for a time-series variable.
Basically the first several observations often look like this,

x <- c(100, 0, 0, 150, 130, 0, 0, 200, 0)

and then later in the life of the variable there are generally no more
0's.  So when I try to calculate the percent change from one observation
to the next, I end up with a lot of NA/NaN/Inf, and sometimes 0's, which is
what I want, in the beginning.

I know I can use x <- na.omit(x), and other forms of this, to get rid of
some of these errors.  But I would rather use some kind of function that
would by default give a 0 while dividing by zero so that I don't lose the
observation, which is what happens when I use na.omit.



Well, this is not an error but proper behavior in the world of math that I
know.


However, to get what you want you could try
x=(100-0)/0
if(!is.finite(x))x=0
x
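
The same idea can be applied to the whole series at once. The following is a sketch using the x from the question; the name pct is illustrative, and replacing non-finite values with 0 is the convention the poster asked for, not a mathematical result.

```r
x <- c(100, 0, 0, 150, 130, 0, 0, 200, 0)

# Percent change from each observation to the next
pct <- 100 * diff(x) / head(x, -1)

# 0/0 gives NaN and nonzero/0 gives Inf or -Inf; set all of them to 0
pct[!is.finite(pct)] <- 0
pct
```
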

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plink bed files

2008-07-25 Thread Robert Gentleman

Hi Federico,
  You will likely have more luck asking Bioinformatics-related
questions on the Bioconductor list.

You might look at the rtracklayer package, which can import bed files if
they really are bed files; otherwise you would have to get the file
format and write your own parser.


Robert


Federico Calboli wrote:

Hi All,

does anyone know how to import binary .bed files generated by Plink (http://pngu.mgh.harvard.edu/~purcell/plink/)
into R? The Plink FAQ explains how to convert other types of files,
not the .bed.


Cheers,

Federico


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com




[[alternative HTML version deleted]]




--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
[EMAIL PROTECTED]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help with longitudinal data plot

2008-08-07 Thread Robert Terwilliger
Dear R Help,

I am attempting to make a plot of longitudinal data, a sample data
frame of which is shown below.

I'd like to show all of the subjects in the same plot, with a set of
connecting line segments for each subject. 'age' would be the x-axis
and 'score' would be the y-axis.

   subject age  score
1    10123  12  51.06
2    10123  14  50.00
3    10123  15  62.22
4    10124  12  74.42
5    10124  13  72.73
6    10124  14  63.41
7    10125  16  54.55
8    10125  17  50.00
9    10125  18  54.35
10   10128  17  97.83
11   10128  18  97.87
12   10128  19 100.00
...

Any help would be appreciated.

Regards,

Robert Terwilliger

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
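
One way such a plot might be drawn in base graphics, as a sketch: set up empty axes, then add one set of connected points per subject. The data are the twelve rows shown in the post; the colours, point style and legend position are arbitrary choices.

```r
# The twelve rows from the post
dat <- data.frame(
  subject = rep(c(10123, 10124, 10125, 10128), each = 3),
  age     = c(12, 14, 15, 12, 13, 14, 16, 17, 18, 17, 18, 19),
  score   = c(51.06, 50.00, 62.22, 74.42, 72.73, 63.41,
              54.55, 50.00, 54.35, 97.83, 97.87, 100.00)
)

# Empty plot spanning all subjects, then one connected line per subject
plot(score ~ age, data = dat, type = "n", xlab = "age", ylab = "score")
by_subject <- split(dat, dat$subject)
for (i in seq_along(by_subject)) {
  s <- by_subject[[i]]
  lines(s$age, s$score, type = "b", col = i, pch = 16)
}
legend("bottomright", legend = names(by_subject),
       col = seq_along(by_subject), lty = 1, pch = 16, title = "subject")
```
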


[R] Simple (?) subset problem

2008-08-14 Thread Farley, Robert
I can't figure out the syntax I need to get subset to work.  I'm trying
to split my dataframe into two parts.  I'm sure this is a simple issue,
but I'm stumped.  I either get all or none of the original "rows".  

 

 

 

 

> XTTable <- xtabs( ~   direction_ , SurveyData)
> XTTable
direction_
EASTBOUND
   345
WESTBOUND
   307
> EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
> XTTable <- xtabs( ~   direction_ , EBSurvey)
> XTTable
direction_
EASTBOUND
     0
WESTBOUND
     0
> EBSurvey <- subset(SurveyData, direction_ = "EASTBOUND" )
> XTTable <- xtabs( ~   direction_ , EBSurvey)
> XTTable
direction_
EASTBOUND
   345
WESTBOUND
   307
> EBSurvey <- subset(SurveyData, direction_ == 1 )
> XTTable <- xtabs( ~   direction_ , EBSurvey)
> XTTable
direction_
EASTBOUND
     0
WESTBOUND
     0
>

 

 

 

 

Robert Farley

Metro

1 Gateway Plaza

Mail Stop 99-23-7

Los Angeles, CA 90012-2952

Voice: (213)922-2532

Fax:(213)922-2868

www.Metro.net 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple (?) subset problem

2008-08-14 Thread Farley, Robert
BINGO!


> str(SurveyData$direction_)
 Factor w/ 2 levels "EASTBOUND
",..: 1 1 1 1 2 2 1 1 2 1 ...
> levels(SurveyData$direction_)
[1] "EASTBOUND " "WESTBOUND
"
>


Was my mistake in how I read the data?

SurveyData <- read.spss("C:/Data/R/orange_delivery.sav",
use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)



That brings up 2 more questions:

How do I "trim" the factor names? {or read them correctly}

How do I write to a list the names of factors? {I have another factor
with ~15 "levels" and I'm a REALLY poor typist}
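
A minimal sketch addressing both questions, using a made-up data frame whose levels carry trailing spaces like the ones shown by str() above: trimws() on the levels removes the padding, and dput() prints the level names in a form that can be pasted into code instead of typed. The read.spss() help page in the foreign package also lists a trim.factor.names argument, which may be worth checking in the installed version.

```r
# Made-up stand-in: factor levels padded with trailing spaces
SurveyData <- data.frame(
  direction_ = factor(c("EASTBOUND      ", "WESTBOUND      ",
                        "EASTBOUND      "))
)

# The padded level does not equal "EASTBOUND", so nothing matches
nrow(subset(SurveyData, direction_ == "EASTBOUND"))   # 0

# Trim the level names in place; the data values follow automatically
levels(SurveyData$direction_) <- trimws(levels(SurveyData$direction_))

# subset() now finds the rows; droplevels() removes the unused level
EBSurvey <- droplevels(subset(SurveyData, direction_ == "EASTBOUND"))
nrow(EBSurvey)                                        # 2

# Print the level names as pasteable R code
dput(levels(SurveyData$direction_))
```
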



Thanks!



 
Robert Farley
Metro
www.Metro.net 
 
-Original Message-
From: Erik Iverson [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 14, 2008 15:47
To: Farley, Robert
Cc: r-help@r-project.org
Subject: Re: [R] Simple (?) subset problem

I can't tell exactly what's wrong, just check out the ?str and ?levels 
functions for some guidance.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Survey Design / Rake questions

2008-08-18 Thread Farley, Robert
I'm trying to learn how to calibrate/postStratify/rake survey data in
preparation for a large survey effort we're about to embark upon.  As a
working example, I have results from a small survey of ~650 respondents,
~90 response fields each.  I'm trying to learn how to (properly?) apply
the aforementioned functions.  

 

 

My data are from a bus on board survey.  The expansion in the dataset is
derived from three elements:

  Response rates by bus stop for a sampled run

  Total runs/samples runs 

  Normalized to (separately derived) daily line boarding  

  

  

In order to get to the point of raking the data, I need to learn more
about the survey package and nomenclature.  For instance, given how I've
described the survey/weighting, is my call to svydesign correct?  I'm
not sure I understand just what a "survey design" is.  Where can I read
up on this?  What's a good reference for such things as "PSUs", "cluster
sampling", and so on.  I've tried the following code, which fails:

 

 

> SurveyData <- read.spss("C:/Data/R/orange_delivery.sav",
    use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> #===
> temp <- sub(' +$', '', SurveyData$direction_)
> SurveyData$direction_ <- temp
> #===
> SurveyData$NumStn=abs(as.numeric(SurveyData$lineon)-as.numeric(SurveyData$lineoff))
> EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
> XTTable <- xtabs(~direction_ , EBSurvey)
> XTTable
direction_
EASTBOUND
      345
> WBSurvey <- subset(SurveyData, direction_ == "WESTBOUND" )
> XTTable <- xtabs(~direction_ , WBSurvey)
> XTTable
direction_
WESTBOUND
      307
> #
> EBDesign <- svydesign(id=~sampn, weights=~expwgt, data=EBSurvey)
> #   svytable(~lineon+lineoff, EBDesign)
> OnLabels <- c("Warner Center", "De Soto", "Pierce College",
    "Tampa", "Reseda", "Balboa", "Woodley", "Sepulveda", "Van Nuys",
    "Woodman", "Valley College", "Laurel Canyon", "North Hollywood")
> EBOnNewTots <- c(1000, 600, 1200, 500, 1000, 500, 200, 250, 1000, 300,
    100, 50, 73.65)
> EBNumStn <- c(673.65, 800, 1000, 1000, 800, 700, 600, 500, 400,
    200, 50, 50)
> ByEBOn <- data.frame(OnLabels,EBOnNewTots)
> ByEBNum <- data.frame(c(1:12),EBNumStn)
> RakedEBSurvey <- rake(EBDesign, list(~ByEBOn, ~ByEBNum),
    list(EBOnNewTots, EBNumStn ) )
Error in model.frame.default(margin, data = design$variables) :
  invalid type (list) for variable 'ByEBOn'
>

 

 

 

Robert Farley

Metro

1 Gateway Plaza

Mail Stop 99-23-7

Los Angeles, CA 90012-2952

Voice: (213)922-2532

Fax:(213)922-2868

www.Metro.net 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survey Design / Rake questions

2008-08-18 Thread Farley, Robert
Thank you for the list of references.  Do you know of any "free"
references available online?  I'll have to find my library card :-)


My motivation is to try to correct for a "time on board" bias we see in
our surveys.  Not surprisingly, riders who are only on board a short
time don't attempt/finish our survey forms.  We're able to weight our
survey to the "bus stop-on by bus run" level.  I want to keep that, and
rake on new (imposed?) marginals, like an estimate of how many minutes
they were on-board derived from their origin-destination.  In practice,
we'll have thousands of observations on hundreds of runs.  As I see it,
my work-plan involves: 

  Running rake successfully on test data
  Preparing "bus stop-on by run" marginals automatically
    Plus any other "pre-existing" marginals to be kept.
  Appending "time on bus" estimates
  Determining the "time on bus" distribution (second survey?)
  Implementing the raking adjustment for a production (large) dataset


As of yet, I cannot get the first step to work  :-(



I hope there are no "fatal flaws" in this concept




 
Robert Farley
Metro
www.Metro.net 
 
-Original Message-----
From: Stas Kolenikov [mailto:[EMAIL PROTECTED] 
Sent: Monday, August 18, 2008 10:32
To: Farley, Robert
Cc: r-help@r-project.org
Subject: Re: [R] Survey Design / Rake questions

Your reading, in increasing order of difficulty/mathematical details,
might be Lohr's "Sampling"
(http://www.citeulike.org/user/ctacmo/article/1068825), Korn &
Graubard's "Health Surveys"
(http://www.citeulike.org/user/ctacmo/article/553280), and Sarndal et
al.'s Survey Math Bible
(http://www.citeulike.org/user/ctacmo/article/716032). You certainly
should try to get a hold of the primary concepts before collecting
your data (or rather before designing your survey... so it might
already be too late!). Post-stratification is not that huge a topic, for
some reason; a review of mathematical details is given by Valliant
(1993) (http://www.citeulike.org/user/ctacmo/article/1036976). On
raking, the paper on top of Google Scholar search by Deville, Sarndal
and Sautory (1993)
(http://www.citeulike.org/user/ctacmo/article/3134001) is certainly
coming from the best people in the field.

I am not aware of general treatment of transportation survey sampling,
although I suspect such references do exist in transportation
research. There might be particular twists as the same subject/bus
usage episode might be sampled at different locations.

As far as rake() procedure is concerned, you need to have your data
set up as sampled observations with two classifications across which
you will be raking, probably the directions "E"/"W" and the stations.
Those are not different data.frames, as you are trying to set them up,
but a single data.frame with several columns. In other words, your
sampled data will have labels "E"/"W" in one of the columns, and
station names in another column, and (the names of) those columns will
be the inputs of rake().
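
To make the layout described above concrete, here is a base-R sketch of what raking does to the weights (iterative proportional fitting). It is not the survey package's implementation. The sample, the starting weights and the population totals are all made up, and rake_once is a hypothetical helper. In the survey package itself, rake() takes sample.margins as a list of one-sided formulas naming columns of the design's data, and population.margins as a list of data frames, each holding that column plus a Freq column of population totals.

```r
# Made-up sample: one row per respondent, two classification columns,
# and a starting weight
samp <- data.frame(
  direction = c("E", "E", "E", "W", "W", "E", "W", "E"),
  station   = c("A", "B", "A", "A", "B", "B", "B", "A"),
  w         = rep(10, 8),
  stringsAsFactors = FALSE
)

# Hypothetical population totals for each classification
pop_direction <- c(E = 50, W = 50)
pop_station   <- c(A = 60, B = 40)

# Rescale weights so the weighted totals of `group` match `target`
rake_once <- function(w, group, target) {
  current <- tapply(w, group, sum)
  ratio   <- target[names(current)] / as.vector(current)
  w * unname(ratio[as.character(group)])
}

# Alternate between the two margins until both are (nearly) matched
for (iter in 1:50) {
  samp$w <- rake_once(samp$w, samp$direction, pop_direction)
  samp$w <- rake_once(samp$w, samp$station,   pop_station)
}

tapply(samp$w, samp$direction, sum)   # close to 50, 50
tapply(samp$w, samp$station, sum)     # close to 60, 40
```
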


-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

