Re: [R] as.data.frame doesn't set col.names

2017-10-25 Thread Eric Berger
Hi Peter,
Thanks for contributing such a great answer. Can you please provide a
pointer to the documentation where it explains why dd$B <- s and dd["B"] <-
s have such different behavior?

(I am perfectly happy if you write the explanation but if it saves you time
to point to some reference that works fine for me.)

Regards,
Eric


On Wed, Oct 25, 2017 at 2:27 PM, Peter Dalgaard  wrote:

>
> > On 24 Oct 2017, at 22:45 , David L Carlson  wrote:
> >
> > You left out all the most important bits of information. What is yo? Are
> you trying to assign a data frame to a single column in another data frame?
> Printing head(samples) tells us nothing about what data types you have,
> especially if the things that look like text are really factors that were
> created when you used one of the read.*() functions. Use str(samples) to
> see what you are dealing with.
>
> Actually, I think there is enough information to diagnose this. The main
> issue is as you point out, assignment of an entire data frame to a column
> of another data frame:
>
> > l <- letters[1:5]
> > s <- as.data.frame(sapply(l,toupper))
> > dput(s)
> structure(list(`sapply(l, toupper)` = structure(1:5, .Label = c("A",
> "B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)",
> row.names = c("a",
> "b", "c", "d", "e"), class = "data.frame")
>
> (incidentally, setting col.names has no effect on this; notice that it is
> only documented as an argument to "list" and "matrix" methods, and sapply()
> returns a vector)
>
> Now, if we do this:
>
> > dd <- data.frame(A=l)
> > dd$B <- s
>
> we end up with a data frame whose B "column" is another data frame
>
> > dput(dd)
> structure(list(A = structure(1:5, .Label = c("a", "b", "c", "d",
> "e"), class = "factor"), B = structure(list(`sapply(l, toupper)` =
> structure(1:5, .Label = c("A",
> "B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)",
> row.names = c("a",
> "b", "c", "d", "e"), class = "data.frame")), .Names = c("A",
> "B"), row.names = c(NA, -5L), class = "data.frame")
>
> In printing such data frames, the inner frame "wins" the column names,
> which is sensible if you consider what would happen if it had more than one
> column:
>
> > dd
>   A sapply(l, toupper)
> 1 a  A
> 2 b  B
> 3 c  C
> 4 d  D
> 5 e  E
>
> To get the effect that Ed probably expected, do
>
> > dd <- data.frame(A=l)
> > dd["B"] <- s
> > dd
>   A B
> 1 a A
> 2 b B
> 3 c C
> 4 d D
> 5 e E
>
> (and notice that single-bracket indexing is crucial here)
>
> -pd
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
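Peter's point can be checked directly; the following sketch reruns his toy example both ways (note that the class of the copied column depends on the R version's stringsAsFactors default):

```r
l <- letters[1:5]
s <- as.data.frame(sapply(l, toupper))

dd1 <- data.frame(A = l)
dd1$B <- s               # $<- embeds the whole data frame s as one "column"
is.data.frame(dd1$B)     # TRUE: B is itself a data frame

dd2 <- data.frame(A = l)
dd2["B"] <- s            # [<- copies s's single column into a new column B
is.data.frame(dd2$B)     # FALSE: B is an ordinary column
names(dd2)               # "A" "B"
```

The behavior of both forms is documented under ?`[.data.frame`.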

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R encountered a fatal error. The session was terminated. + *** caught illegal operation ***

2017-10-26 Thread Eric Berger
How about going back to earlier versions if you don't need the latest ones?


On Thu, Oct 26, 2017 at 12:59 PM, Klaus Michael Keller <
klaus.kel...@graduateinstitute.ch> wrote:

> Dear all,
>
> I just installed the "Short Summer" R update last week. Now, my R Studio
> doesn't open anymore!
>
> --> R encountered a fatal error.  The session was terminated.
>
> and my R terminal doesn't close properly
>
> --> *** caught illegal operation ***
>
> I restarted my Mac OS Sierra 10.12.6 and reinstalled both R 3.4.2 and the
> latest R studio but the problem persists.
>
> How can that issue be solved?
>
> Thanks in advance for your precious help!
>
> All the best from Switzerland,
>
> Klaus


Re: [R] My function and NA Values Problem

2017-10-27 Thread Eric Berger
na.rm=TRUE (you need to capitalize: TRUE, not True)
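R is case-sensitive, so `True` is not the logical constant `TRUE`. A toy sketch (invented columns standing in for the poster's Rice/Coke data):

```r
g1 <- data.frame(Rice = c(1, 2, 3), Coke = c(0, NA, 5))  # assumed shape

apply(g1, 2, function(c) sum(c == 0))                # Coke yields NA
apply(g1, 2, function(c) sum(c == 0, na.rm = TRUE))  # Rice 0, Coke 1
```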


On Fri, Oct 27, 2017 at 10:43 AM, Engin YILMAZ 
wrote:

> Dear R Staff
>
> My working file is in the annex. "g1.csv"
> I have only 2 columns. Rice and coke.
> I try to execute the following function, but it does not work,
> because the "Coke" column has NA values.
> I tried to add "na.rm=True" to the function but it did not work.
> How can I solve this problem with this function or another algorithm?
> (Note: I have normally 450 columns)
>
> Sincerely
> Engin YILMAZ
>
>
> apply(g1, 2, function(c) sum(c==0))
>
> Rice Coke
>0   NA


Re: [R] Count non-zero values in excluding NA Values

2017-10-29 Thread Eric Berger
If one does not need all the intermediate results then after defining data
just one line:

grand_total <- nrow(data)*ncol(data) - sum( sapply(data, function(x) sum(
is.na(x) | x == 0 ) ) )
# 76
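The one-liner can be checked against a direct count: `sum(x != 0, na.rm = TRUE)` excludes NA and 0 cells in one step. A sketch using fake data in the spirit of Rui's example (the sampled values depend on the R version's RNG, so only the equivalence of the two counts is asserted, not the number 76):

```r
set.seed(3026)
m <- matrix(1:100, ncol = 10)
m[sample(100, 15)] <- 0
m[sample(100, 10)] <- NA
data <- as.data.frame(m)

grand_total <- nrow(data) * ncol(data) -
  sum(sapply(data, function(x) sum(is.na(x) | x == 0)))

grand_total                   # count of non-zero, non-NA cells
sum(data != 0, na.rm = TRUE)  # the same number, computed directly
```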




On Sun, Oct 29, 2017 at 2:38 PM, Rui Barradas  wrote:

> Hello,
>
> Your attachment didn't come through; R-Help strips off most types of
> files, including CSV.
> Anyway, the following will do what I understand of your question. Tested
> with a fake dataset.
>
>
> set.seed(3026)# make the results reproducible
> data <- matrix(1:100, ncol = 10)
> data[sample(100, 15)] <- 0
> data[sample(100, 10)] <- NA
> data <- as.data.frame(data)
>
> zero <- sapply(data, function(x) sum(x == 0, na.rm = TRUE))
> na <- sapply(data, function(x) sum(is.na(x)))
> totals <- nrow(data) - zero - na  # totals non zero per column
> grand_total <- sum(totals)# total non zero
>
> totals
> # V1  V2  V3  V4  V5  V6  V7  V8  V9 V10
> #  6   8   8   8   8   7   7   8   6  10
>
> grand_total
> #[1] 76
>
> # another way
> prod(dim(data)) - sum(zero + na)
> #[1] 76
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> Em 29-10-2017 10:25, Engin YILMAZ escreveu:
>
>> Dear R Staff
>>
>> You can see my data.csv file in the annex.
>>
>> I try to count non-zero values in dataset but I need to exclude NA in this
>> calculation
>>
>> My code is very long (following).
>> How can I write this code more efficiently and concisely?
>>
>> ## [NA_Count] - Find NA values
>>
>> data.na = sapply(data[, 3:ncol(data)], function(c) sum(length(which(is.na(c)))))
>>
>>
>> ## [Zero] - Find zero values
>>
>> data.z=apply(data[,3:ncol(data)], 2, function(c) sum(c==0))
>>
>>
>> ## [Non-Zero] - Find non-zero values
>>
>> data.nz=nrow(data[,3:ncol(data)])- (data.na+data.z)
>>
>>
>> Sincerely
>> Engin YILMAZ
>>
>



Re: [R] Pass Parameters to RScript?

2017-10-30 Thread Eric Berger
I did a simple search and got hits immediately, e.g.
https://www.r-bloggers.com/passing-arguments-to-an-r-script-from-command-lines/
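The mechanism described there boils down to commandArgs(); a minimal sketch (the file name and arguments below are illustrative):

```r
# args.R -- invoke as:  Rscript args.R 10 hello
args <- commandArgs(trailingOnly = TRUE)  # only the user-supplied arguments
n   <- as.integer(args[1])
msg <- args[2]
cat("n =", n, "msg =", msg, "\n")
```

Any language that can spawn a process (Java included) can pass parameters this way.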


On Mon, Oct 30, 2017 at 2:30 PM, Morkus via R-help 
wrote:

> Is it possible to pass parameters to an R Script, say, from Java or other
> language?
>
> I did some searches, but came up blank.
>
> Thanks very much in advance,
>
> Sent from [ProtonMail](https://protonmail.com), Swiss-based encrypted
> email.
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pass Parameters to RScript?

2017-10-30 Thread Eric Berger
I do not program in Java, but it seems a Java program can make system calls,
which would be equivalent to running Rscript from the command line, just done
from within a Java program. I am not sure whether that would meet your needs.
Just a suggestion.

Check out

http://www.java-samples.com/showtutorial.php?tutorialid=8



On Mon, Oct 30, 2017 at 5:10 PM, Morkus  wrote:

> Thanks Eric,
>
> I saw that page, too, but it states:
>
> "This post describes how to pass external arguments to *R* when calling a
> Rscript *with a command line.*"
>
> Not what I'm trying to do.
>
> Thanks for your reply.
>
> Sent from ProtonMail , Swiss-based encrypted
> email.
>
>



Re: [R] convertTime package.

2017-10-31 Thread Eric Berger
If you need a function (e.g. convertTime ) from a package (unknown?) then
you cannot simply instruct R to install the function.
e.g. if you give the command
> install.packages("convertTime")
you will get an error message like
"package 'convertTime' is not available (for R version 3.4.1)"

I did a Google search and found a package called SGP that has a function
"convertTime". I have no idea if it is the function you are looking for but
installing that package worked fine in R version 3.4.1.  So you can try

> install.packages("SGP")

HTH,
Eric


On Tue, Oct 31, 2017 at 9:15 PM, Sarah Goslee 
wrote:

> Hi Scott,
>
> Where did you get this function originally? I can't find anything about it.
>
> What OS are you using?
>
> What says, "not available for the version"? Where are you getting that
> error?
>
> What are you trying to accomplish? What does that function actually
> do? It's impossible to suggest a work-around for a function of unknown
> purpose and origin.
>
> (The posting guide for this list suggests you include all of that
> information when you inquire.)
>
> Sarah
>
>
> On Tue, Oct 31, 2017 at 2:04 PM, Scott Anderwald via R-help
>  wrote:
> > To whom it might concern.  I am working on a project that needs the
> convertTime function. I am currently using version 3.4.1 and it says not
> available for the version. Two questions: is there a workaround for the
> function, or is there another package that contains that function?
> >
> >
> > Thanks,
> >
> >
> > Scott Anderwald
>
>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
>



Re: [R] beta binomial distribution installation

2017-11-01 Thread Eric Berger
Hi,
I did a quick search for other packages that provide the beta binomial
distribution and found "rmutil".

> install.packages("rmutil")

The package has the CDF (pbetabinom) and inverse CDF (qbetabinom) among
other functions.

HTH,
Eric



On Wed, Nov 1, 2017 at 7:50 AM, MCGUIRE, Rhydwyn <
rm...@doh.health.nsw.gov.au> wrote:

> Hi there,
>
> It looks like you also need the bioconductor package biobase, I found
> instructions for downloading that package here:
> www.bioconductor.org/install
>
> Good luck.
>
> Cheers,
> Rhydwyn
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Amany
> Abdel-Karim
> Sent: Wednesday, 1 November 2017 2:13 PM
> To: r-h...@stat.math.ethz.ch
> Subject: [R] beta binomial distribution installation
>
> Hello,
>
> I tried to install the package TailRank using the commands install.packages
> ("TailRank") and library(TailRank), but I got the following errors. So how
> can I install TailRank in RStudio to get the beta-binomial distribution, CDF,
> and inverse CDF of the beta-binomial?
>
> The commands I used are:
>
> > install.packages("TailRank")
>
> Installing package into C:/Users/stator-guest/Documents/R/win-library/3.4
>
> (as lib is unspecified)
>
> Warning in install.packages :
>
>   dependency Biobase is not available
>
> trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/TailRank_
> 3.1.3.zip'
>
> Content type 'application/zip' length 331270 bytes (323 KB)
>
> downloaded 323 KB
>
>
>
> package TailRank successfully unpacked and MD5 sums checked
>
>
>
> The downloaded binary packages are in
>
> C:\Users\stator-guest\AppData\Local\Temp\RtmpoVx40V\
> downloaded_packages
>
> > library(TailRank)
>
> Error: package or namespace load failed for TailRank in loadNamespace(i,
> c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
>
>  there is no package called Biobase
>
> In addition: Warning message:
>
> package TailRank was built under R version 3.4.2
>
>
>
>
>



Re: [R] Function to save results

2017-11-01 Thread Eric Berger
Some comments:
1. sink() does not return a value. There is no point in setting attr <-
sink(...). Just give the command sink("C://etc")
2. to complete the saving to the file you must give a second sink command
with no argument:  sink()
So your code would be (pseudo-code, not actual code)

sink( "filename" )
do something that prints output which will be captured by sink
sink()

HTH,
Eric
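The pseudo-code above, made concrete (a sketch; tempfile() stands in for a real path):

```r
out <- tempfile(fileext = ".txt")

sink(out)              # start diverting console output to the file
print(summary(1:10))   # anything printed here is captured
sink()                 # stop diverting; the file is now complete

readLines(out)         # the captured output
```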



On Wed, Nov 1, 2017 at 1:32 PM, Priya Arasu via R-help  wrote:

> Hi, I want the results to be saved automatically in an output text file
> after the script has finished running.
>
> I used the sink function in the following example, but the results file
> (output.txt) was empty.
>
> net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all
> genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt") #
> First I loaded the input file for which I want to identify attractors
> attr <- sink("C://Users//Priya//Desktop//Attractor analysis_all
> genes//synaptogenesis//output.txt")# used the sink function to save the
> results from attr function
>
> attr <- getAttractors(net, type="asynchronous")# then ran the script for
> identifying attractors
> Is there any function to save the results before setting the script to
> run, so that results are automatically saved in a text file after the
> script has finished running?
>
> Thank youPriya
>
>
>
>



Re: [R] Function to save results

2017-11-01 Thread Eric Berger
Hi Priya,

You did not follow the logic of the pseudo-code.
The sink("filename"), sink() pair captures whatever output is generated
between the first sink statement and the second sink statement.
You need (possibly) to do:

sink("C://Users//Priya//Desktop//Attractor analysis_all
genes//synaptogenesis//attr.txt")


net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all
genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt")

attr <- getAttractors(net, type="asynchronous")


sink()


HTH,

Eric






On Wed, Nov 1, 2017 at 4:10 PM, Priya Arasu  wrote:

> Hi Eric,
> I tried as you suggested but I could not find the output in the text file
> I created (attr.txt)
>
> net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all 
> genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt")
>
> sink("C://Users//Priya//Desktop//Attractor analysis_all 
> genes//synaptogenesis//attr.txt")
>
>
> sink()
>
> attr <- getAttractors(net, type="asynchronous")
>
>
> Priya
>
>



Re: [R] beta binomial distribution installation

2017-11-01 Thread Eric Berger
Hi Amany,
I had no trouble installing TailRank and Bioconductor using the link Rhydwyn provided.
I was curious about your statement that TailRank uses a different
parameterization for the betabinomial distribution than rmutil.
I looked at the documentation for the two packages and the transformation
to go from one to the other is straightforward.
If you want, you can do the following.
Let (N,u,v) be the parameters used in TailRank and (N,m,s) the parameters
used in rmutil. The correspondence is:

N = N, m  = u/(u+v), s = u+v.

This means you can define functions such as:

mypbb <- function(x, N, u, v) { rmutil::pbetabinom(x, N, u/(u+v), u+v) }  # CDF
myqbb <- function(p, N, u, v) { rmutil::qbetabinom(p, N, u/(u+v), u+v) }  # inverse CDF

HTH,
Eric
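The mapping can be sanity-checked in base R by writing the beta-binomial pmf in both parameterizations (a sketch; the (u,v) and (m,s) labels follow the correspondence above, which is an assumption about the two packages' conventions):

```r
# pmf with shape parameters u, v (TailRank-style, assumed)
dbb_uv <- function(x, N, u, v) choose(N, x) * beta(x + u, N - x + v) / beta(u, v)

# pmf with mean m = u/(u+v) and dispersion s = u+v (rmutil-style, assumed)
dbb_ms <- function(x, N, m, s)
  choose(N, x) * beta(x + s * m, N - x + s * (1 - m)) / beta(s * m, s * (1 - m))

u <- 3; v <- 10; N <- 20; x <- 0:N
all.equal(dbb_uv(x, N, u, v), dbb_ms(x, N, u/(u+v), u + v))  # TRUE
sum(dbb_uv(x, N, u, v))                                      # 1, a proper pmf
```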





On Wed, Nov 1, 2017 at 6:09 PM, Amany Abdel-Karim 
wrote:

> Hello,
>
> Thank you for your response. I need to install the TailRank package since it
> contains the beta binomial distribution, CDF and inverse CDF in the usual
> form which I need to use. However, the rmutil package contains unusual forms
> for these functions, so it is easier for me to deal with the forms
> contained in TailRank.
>
> I tried to install  bioconductor package, using the following commands
> but I still got the following errors:
>
>
>  (1) I tried biocLite() and then library ("TailRank"), I got the following
> errors.
>
> > biocLite()
> Error in biocLite() : could not find function "biocLite"
> > library("TailRank")
> Loading required package: oompaBase
> Error: package or namespace load failed for 'TailRank' in loadNamespace(i,
> c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
>  there is no package called 'Biobase'
> In addition: Warning messages:
> 1: package 'TailRank' was built under R version 3.4.2
> 2: package 'oompaBase' was built under R version 3.4.2
>
>
> (2) I tried to write the command biocLite(), then biocLite("TailRank"), I got 
> the following errors:
>
> > biocLite()
> Error in biocLite() : could not find function "biocLite"
> > biocLite("ilRank")
> Error in biocLite("ilRank") : could not find function "biocLite"
> > biocLite()
> Error in biocLite() : could not find function "biocLite"
> > biocLite("TailRank")
> Error in biocLite("TailRank") : could not find function "biocLite"
>
> >
>
>
> Also, I checked under packages on the right side of the R window and I
> found TailRank , Description is Tail-Rank statistic, and version is 3.1.3.
> So, I tried to write the following code in the console window to check if
> the package works:
>
> > N <- 20
> > u <- 3
> > v <- 10
> > p <- u/(u+v)
> > x <- 0:N
> > yy <- dbb(x, N, u, v)
>
> I got the following error:
> Error in dbb(x, N, u, v) : could not find function "dbb"
>
> >
>
> I am confused because if the package TailRank is already there, why does the
> previous code not work to calculate dbb(x, N, u, v), and why do I get an
> error? If I do not have the package, would you please let me know the right
> commands I should write in the script window to install TailRank, because
> the commands I used (which I mentioned at the beginning of the email) did
> not work and gave errors. I appreciate your help since I am a new user of R.
>
>
> Amany
>
>
>
> --
>
> *From:* Eric Berger 
> *Sent:* Wednesday, November 1, 2017 2:42 AM
> *To:* MCGUIRE, Rhydwyn
> *Cc:* Amany Abdel-Karim; r-h...@stat.math.ethz.ch
> *Subject:* Re: [R] beta binomial distribution installation
>
> Hi,
> I did a quick search for other packages that provide the beta binomial
> distribution and found "rmutil".
>
> > install.packages("rmutil")
>
> The package has the CDF (pbetabinom) and inverse CDF (qbetabinom) among
> other functions.
>
> HTH,
> Eric
>
>
>
> On Wed, Nov 1, 2017 at 7:50 AM, MCGUIRE, Rhydwyn <
> rm...@doh.health.nsw.gov.au> wrote:
>
>> Hi there,
>>
>> It looks like you also need the bioconductor package biobase, I found
>> instructions for downloading that package here:
>> www.bioconductor.org/install
>>
>> Good luck.
>>
>> Cheers,
>> Rhydwyn
>>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Amany
>> Abdel-Karim
>> Sent: Wednesday, 1 November 2017 2:13 PM
>> To: r-h...@stat.math.ethz.ch
>> Subject: [R] beta binomial distribution installation
>>
>> Hello,
>>
>> I  tried to install package TailRank using the command install.packages
>> (RankTail) and library (TailRank) but I got the following 

Re: [R] Correct subsetting in R

2017-11-01 Thread Eric Berger
matches <- merge(training,data,by=intersect(names(training),names(data)))

HTH,
Eric
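A toy sketch of why this recovers the IDs (column names invented for illustration; it assumes the shared feature columns jointly identify rows):

```r
data     <- data.frame(ID = 1:5, boy = c(1, 2, 3, 4, 0), cooki = c(5, 6, 7, 8, 9))
training <- data[c(2, 4), c("boy", "cooki")]  # a subset lacking the ID column

matches <- merge(training, data, by = intersect(names(training), names(data)))
matches$ID                                    # the IDs of the matched rows
```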


On Wed, Nov 1, 2017 at 6:13 PM, Elahe chalabi via R-help <
r-help@r-project.org> wrote:

> Hi all,
> I have two data frames that one of them does not have the column ID:
>
> > str(data)
> 'data.frame':   499 obs. of  608 variables:
> $ ID   : int  1 2 3 4 5 6 7 8 9 10 ...
> $ alright  : int  1 0 0 0 0 0 0 1 2 1 ...
> $ bad  : int  1 0 0 0 0 0 0 0 0 0 ...
> $ boy  : int  1 2 1 1 0 2 2 4 2 1 ...
> $ cooki: int  1 2 2 1 0 1 1 4 2 3 ...
> $ curtain  : int  1 0 0 0 0 2 0 2 0 0 ...
> $ dish : int  2 1 0 1 0 0 1 2 2 2 ...
> $ doesnt   : int  1 0 0 0 0 0 0 0 1 0 ...
> $ dont : int  2 1 4 2 0 0 2 1 2 0 ...
> $ fall : int  3 1 0 0 1 0 1 2 3 2 ...
> $ fell : int  1 0 0 0 0 0 0 0 0 0 ...
>
> and the other one is:
>
> > str(training)
> 'data.frame':   375 obs. of  607 variables:
> $ alright  : num  1 0 0 0 1 2 1 0 0 0 ...
> $ bad  : num  1 0 0 0 0 0 0 0 0 0 ...
> $ boy  : num  1 1 2 2 4 2 1 0 1 0 ...
> $ cooki: num  1 1 1 1 4 2 3 1 2 2 ...
> $ curtain  : num  1 0 2 0 2 0 0 0 0 0 ...
> $ dish : num  2 1 0 1 2 2 2 1 4 1 ...
> $ doesnt   : num  1 0 0 0 0 1 0 0 0 0 ...
> $ dont : num  2 2 0 2 1 2 0 0 1 0 ...
> $ fall : num  3 0 0 1 2 3 2 0 2 0 ...
> $ fell : num  1 0 0 0 0 0 0 0 0 0 ...
> Does anyone know how should I get the IDs of training from data?
> thanks for any help!
> Elahe
>
>



Re: [R] Function to save results

2017-11-01 Thread Eric Berger
Hi Priya,
I think your original question may have been phrased in a way that caused
David and me some confusion.
I think sink() may not be the appropriate function in your case.
sink() captures output that would otherwise be printed to the console.
You are trying to save the value returned by a calculation, in this case
the variable 'attr'.
You need to do something like:

attr <- getAttractors( ... )
saveRDS( attr, "filename.RDS")

and then later you can read the results back in another R session:

savedAttr <- readRDS("filename.RDS")

Look at the documentation ?saveRDS and ?readRDS

HTH,
Eric
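The round trip looks like this (a sketch; a list stands in for the object getAttractors() would return, and tempfile() stands in for a real path):

```r
attr_result <- list(states = 1:3, type = "asynchronous")  # stand-in object

f <- tempfile(fileext = ".RDS")
saveRDS(attr_result, f)           # serialize the object to disk

restored <- readRDS(f)            # later, possibly in a new session
identical(restored, attr_result)  # TRUE
```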

On Wed, Nov 1, 2017 at 6:02 PM, David L Carlson  wrote:

> Let's try a simple example.
>
> > # Create a script file of commands
> > # Note we must print the results of quantile explicitly
> > cat("x <- rnorm(50)\nprint(quantile(x))\nstem(x)\n", file="Test.R")
> >
> > # Test it by running it to the console
> > source("Test.R")
> 0%25%50%75%   100%
> -2.4736219 -0.7915433 -0.1178056  0.7023577  2.9158617
>
>   The decimal point is at the |
>
>   -2 | 510
>   -1 | 7631110
>   -0 | 998877733211
>0 | 011244889
>1 | 00045
>2 | 19
>
> >
> > # Now run it and save the file
> > sink("Testout.txt")
> > source("Test.R")
> > sink()
> >
> > # What is located in "Testout.txt"?
> > cat(readLines("Testout.txt"), sep="\n")
>  0% 25% 50% 75%100%
> -2.47511893 -0.47919111  0.05761628  0.67403447  1.79825459
>
>   The decimal point is at the |
>
>   -2 | 5
>   -2 | 4
>   -1 |
>   -1 | 432000
>   -0 | 87755
>   -0 | 442110
>0 | 001244
>0 | 556789
>1 | 113
>1 | 5788
>
> > # Success
>
> Depending on your operating system, you may also be able to save the
> output with File | Save to File.
>
> ---
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Priya
> Arasu via R-help
> Sent: Wednesday, November 1, 2017 9:57 AM
> To: Eric Berger 
> Cc: r-help@r-project.org
> Subject: Re: [R] Function to save results
>
> Hi Eric, thanks for the explanation. Is there a way to save the results
> automatically after the analysis gets over? I recently lost my results
> because I didn't save them. I don't want to run the sink or save command
> after the analysis is over; rather, I want to run the command for saving
> the file before starting the analysis, so the file gets saved
> automatically after the script has finished running. Priya
>
>
>
> On Wednesday, 1 November 2017 7:53 PM, Eric Berger <
> ericjber...@gmail.com> wrote:
>
>
 Hi Priya,
You did not follow the logic of the pseudo-code. The sink("filename"),
sink() pair captures whatever output is generated between the first sink
statement and the second sink statement. You need (possibly) to do:

sink("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//attr.txt")
net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt")
attr <- getAttractors(net, type="asynchronous")
sink()

HTH,
Eric
>
>
>
> On Wed, Nov 1, 2017 at 4:10 PM, Priya Arasu 
> wrote:
>
Hi Eric,
I tried as you suggested but I could not find the output in the text
file I created (attr.txt):

net <- loadNetwork("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//regulationof_dopamine_signaling_submodule3.txt")
sink("C://Users//Priya//Desktop//Attractor analysis_all genes//synaptogenesis//attr.txt")

sink()

attr <- getAttractors(net, type="asynchronous")

Priya
>
>
> On Wednesday, 1 November 2017 6:54 PM, Eric Berger <
> ericjber...@gmail.com> wrote:
>
>
 Some comments:
1. sink() does not return a value. There is no point in setting
attr <- sink(...). Just give the command sink("C://etc").
2. To complete the saving to the file you must give a second sink command
with no argument: sink(). So your code would be (pseudo-code, not actual code):

sink( "filename" )
# do something that prints output, which will be captured by sink
sink()

HTH,
Eric
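The pattern in that pseudo-code can be made concrete; a minimal runnable sketch, writing to a temporary file so the example is self-contained:

```r
outfile <- tempfile(fileext = ".txt")   # stand-in for a real output path

sink(outfile)                 # start redirecting printed output
print(summary(rnorm(100)))    # anything printed here goes to the file
sink()                        # stop redirecting; the file is now complete

readLines(outfile)            # inspect what was captured
```

Note that sink() only captures *printed* output: an assignment such as `attr <- getAttractors(net)` prints nothing, so you must also call print(attr) between the two sink statements for anything to land in the file.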
>
>
> On Wed, Nov 1, 2017 at 1:32 PM, Priya Arasu 

Re: [R] Correct subsetting in R

2017-11-01 Thread Eric Berger
training$TrainingRownum <- 1:nrow(training)
data$DataRownum <- 1:nrow(data)
matches <- merge(training,data,by=intersect(names(training),names(data)))

The data frame 'matches' now has additional columns telling you the row in
each data frame corresponding to the matched items.

Regards,
Eric
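A small reproducible sketch of this recipe with made-up data (the column names here are invented for illustration):

```r
# 'data' carries an ID column; 'training' holds a subset of its rows without it
data     <- data.frame(ID    = 1:5,
                       boy   = c(1, 2, 1, 3, 0),
                       cooki = c(1, 2, 2, 1, 0))
training <- data.frame(boy = c(2, 3), cooki = c(2, 1))

training$TrainingRownum <- 1:nrow(training)
data$DataRownum         <- 1:nrow(data)

# merge on the shared columns only (the helper columns are not shared)
matches <- merge(training, data,
                 by = intersect(names(training), names(data)))
matches$ID   # the IDs in 'data' corresponding to each training row
```

One caveat: if several rows of 'data' share identical values in all common columns, merge() returns one match per duplicate, so the result can have more rows than 'training'.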

On Wed, Nov 1, 2017 at 9:29 PM, Elahe chalabi 
wrote:

>
> It's not what I want, the first data frame has 499 observations and the
> second data frame is a subset of the first one but with 375 observations. I
> want something that returns the ID for training data frame
>
>
> On Wednesday, November 1, 2017 10:18 AM, Eric Berger <
> ericjber...@gmail.com> wrote:
>
>
>
> matches <- merge(training,data,by=intersect(names(training),names(data)))
>
> HTH,
> Eric
>
>
>
> On Wed, Nov 1, 2017 at 6:13 PM, Elahe chalabi via R-help <
> r-help@r-project.org> wrote:
>
> Hi all,
> >I have two data frames that one of them does not have the column ID:
> >
> >> str(data)
> >'data.frame':   499 obs. of  608 variables:
> >$ ID   : int  1 2 3 4 5 6 7 8 9 10 ...
> >$ alright  : int  1 0 0 0 0 0 0 1 2 1 ...
> >$ bad  : int  1 0 0 0 0 0 0 0 0 0 ...
> >$ boy  : int  1 2 1 1 0 2 2 4 2 1 ...
> >$ cooki: int  1 2 2 1 0 1 1 4 2 3 ...
> >$ curtain  : int  1 0 0 0 0 2 0 2 0 0 ...
> >$ dish : int  2 1 0 1 0 0 1 2 2 2 ...
> >$ doesnt   : int  1 0 0 0 0 0 0 0 1 0 ...
> >$ dont : int  2 1 4 2 0 0 2 1 2 0 ...
> >$ fall : int  3 1 0 0 1 0 1 2 3 2 ...
> >$ fell : int  1 0 0 0 0 0 0 0 0 0 ...
> >
> >and the other one is:
> >
> >> str(training)
> >'data.frame':   375 obs. of  607 variables:
> >$ alright  : num  1 0 0 0 1 2 1 0 0 0 ...
> >$ bad  : num  1 0 0 0 0 0 0 0 0 0 ...
> >$ boy  : num  1 1 2 2 4 2 1 0 1 0 ...
> >$ cooki: num  1 1 1 1 4 2 3 1 2 2 ...
> >$ curtain  : num  1 0 2 0 2 0 0 0 0 0 ...
> >$ dish : num  2 1 0 1 2 2 2 1 4 1 ...
> >$ doesnt   : num  1 0 0 0 0 1 0 0 0 0 ...
> >$ dont : num  2 2 0 2 1 2 0 0 1 0 ...
> >$ fall : num  3 0 0 1 2 3 2 0 2 0 ...
> >$ fell : num  1 0 0 0 0 0 0 0 0 0 ...
> >Does anyone know how should I get the IDs of training from data?
> >thanks for any help!
> >Elahe
> >
> >__ 
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/ listinfo/r-help
> >PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding Records to a Table in R

2017-11-01 Thread Eric Berger
Hi Paul,

#First I set up some sample data since I don't have a copy of your data
dtOrig <- as.Date( c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
CargoTons=c(1000,1080,3785,4200) )

#Generate the complete set of dates as a data frame
dfDates<- data.frame( TransitDate=seq(from=as.Date("1985-04-01"),by="1
month",length=13) )

# do the merge adding the "missing" rows (where NA will appear)
dfNew  <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE )

# replace the NA's by zero
dfNew[is.na(dfNew)] <- 0

HTH,
Eric
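The merge-and-zero-fill pattern above can be checked end to end with a compact self-contained sketch (fewer dates than the original, to keep the output small):

```r
dfOrig  <- data.frame(TransitDate = as.Date(c("1985-04-01", "1985-07-01")),
                      Transits = c(100, 100), CargoTons = c(1000, 1080))

dfDates <- data.frame(TransitDate = seq(from = as.Date("1985-04-01"),
                                        by = "1 month", length = 4))

dfNew <- merge(dfDates, dfOrig, by = "TransitDate", all.x = TRUE)
dfNew[is.na(dfNew)] <- 0   # zero-fill the months that were missing

dfNew   # 4 monthly rows; 1985-05-01 and 1985-06-01 now carry zeros
```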


On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal  wrote:

> Dear R friends,
>
> I am currently working with time series data, and I have a table(as data
> frame) that has looks like this (TransitDate are in format = "%e-%B-%Y") :
>
> TransitDate   Transits  CargoTons
> 1985-04-01    100       2500
> 1985-05-01    135       4500
> 1985-06-01    120       1750
> 1985-07-01    100       3750
> 1985-08-01    200       1250
>
> The problem is, that there are several periods that don´t exist in the
> table, so it has the following behavior:
>
> TransitDate   Transits  CargoTons
> 1985-04-01    100       1000
> 1985-07-01    100       1080
> 1985-12-01    500       3785
> 1986-04-01    325       4200
> .
> .
> 2017-09-01    400       2350 (*this is the last observation)
>
> You can see in the last table fragment that the series jumps from
> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01 making the
> time series quite irregular (non-constant chronologically speaking).
>
> What I want to do is create a dummy table that has the sequence from the
> first observation (1985-04-01) up to the last one (2017-09-01) and then
> develop a code that checks if the dates contained in the dummy table exist
> in the original table, if they don´t exist then add those dates and put
> zeroes on the fields.
>
> How can I achieve this?
>
> Any help will be greatly appreciated,
>
> Best regards,
>
> Paul
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] FW: Time Series

2017-11-07 Thread Eric Berger
Following Erin's pointer:

library(zoo)
times <- seq(from=as.POSIXct("2015-12-18 00:00:00"),
to=as.POSIXct("2017-10-24 23:00:00"), by="hour")
mydata <- rnorm(length(times))
tseri  <- zoo( x=mydata, order.by=times )

HTH,
Eric
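A quick sanity check of the zoo object built this way (shortened to six hours so the output stays small):

```r
library(zoo)

times <- seq(from = as.POSIXct("2015-12-18 00:00:00"),
             to   = as.POSIXct("2015-12-18 05:00:00"), by = "hour")
tseri <- zoo(x = rnorm(length(times)), order.by = times)

length(tseri)                                     # 6 hourly observations
window(tseri, start = times[2], end = times[4])   # subset by time range
```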


On Tue, Nov 7, 2017 at 9:59 AM, Erin Hodgess 
wrote:

> Hello!
>
> What is the error message, please?
>
> At first glance, you are using the "ts" function.  That doesn't work for
> hourly frequency.
>
> You may want to create a zoo object.
>
> This is Round One.
>
> Sincerely,
> Erin
>
>
> On Tue, Nov 7, 2017 at 1:46 AM, Emre Karagülle 
> wrote:
>
> >
> > Hi,
> > I would like to ask a question about time series.
> > I am trying to convert my data into time series data.
> > I have hourly data from “2015-12-18 00:00” to “2017-10-24 23:00”
> > I am trying the following codes but they are not working.
> > Could you help me out?
> >
> > tseri <- ts(data ,seq(from=as.POSIXct("2015-12-18 00:00:00"),
> > to=as.POSIXct("2017-10-24 23:00:00"), by="hour"))
> >
> > tseri <- ts(data ,seq(from=as.Date("2015-12-18 00:00:00"),
> > to=as.Date("2017-10-24 23:00:00"), by="hour"))
> >
> >
> > Thank you
> >
> > --
> > Emre
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> > posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Mathematical and Statistics
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fwd: FW: Time Series

2017-11-07 Thread Eric Berger
[Please send replies to r-help, not individual responders]
Emre,
In R, when you call a function defined via something like
f <- function( foo, bar )
then you can call it as, for, example

a <- f(x,y)

or

a <- f(foo=x, bar=y)

or even

a <- f( bar=y, foo=x)   # notice I switched the order!

The first approach requires you to pass the arguments in the same order as
the function is expecting.
In the second and third examples you can pass the arguments in any order
you want since you have indicated the variable names.
The point is that the variable name goes on the left of the '=' and the
values (or variables) you are passing go on the right of the '='.

The zoo function is defined as
zoo( x = NULL, order.by = index(x), etc ... )

In my example code I passed the variable 'mydata' to the parameter 'x' via

zoo(x=mydata, order.by=times)

If you called your local variable x then you can call zoo via

zoo(x=x, order.by=whatever)  # using the 'named parameters' approach

or

zoo(x, order.by=whatever)   # where zoo will match the first argument to
the first parameter in its definition.

Hopefully this will help you understand why some of your attempts worked
and some did not work.
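The argument-matching rules described above can be seen with a two-line example:

```r
f <- function(foo, bar) foo - bar

f(10, 3)              # positional: foo = 10, bar = 3  ->  7
f(bar = 3, foo = 10)  # named: order no longer matters  ->  7
f(baz = 3, foo = 10)  # error: unused argument (baz = 3)
```

The last call reproduces the kind of "unused argument" error reported in this thread: a supplied name that does not match any parameter in the function's definition.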

Regards,
Eric



Hi Erin and Eric

As both of you suggested, I followed Erin's command.

It failed with the following command;
when I wrote x, which is my numeric vector, it said "unused argument".

tseri  <- zoo( x=mydata, order.by=times )



when use  it without x=mydata like,

tseri  <- zoo( x, order.by=times )



it works.

I checked it by following command

x[times==as.POSIXct("2015-12-18 02:00:00")] and it gave me the true value.



Do you think it is okay?



By the way, I appreciate for fast reply.

Thank you.



--
Emre



*From: *Eric Berger 
*Sent: *Tuesday, November 7, 2017 11:08 AM
*To: *Erin Hodgess 
*Cc: *Emre Karagülle ; r-help@r-project.org
*Subject: *Re: [R] FW: Time Series



Following Erin's pointer:



library(zoo)
times <- seq(from=as.POSIXct("2015-12-18 00:00:00"),
to=as.POSIXct("2017-10-24 23:00:00"), by="hour")
mydata <- rnorm(length(times))
tseri  <- zoo( x=mydata, order.by=times )



HTH,

Eric





On Tue, Nov 7, 2017 at 9:59 AM, Erin Hodgess 
wrote:

Hello!

What is the error message, please?

At first glance, you are using the "ts" function.  That doesn't work for
hourly frequency.

You may want to create a zoo object.

This is Round One.

Sincerely,
Erin


On Tue, Nov 7, 2017 at 1:46 AM, Emre Karagülle 
wrote:


>
> Hi,
> I would like to ask a question about time series.
> I am trying to convert my data into time series data.
> I have hourly data from “2015-12-18 00:00” to “2017-10-24 23:00”
> I am trying the following codes but they are not working.
> Could you help me out?
>
> tseri <- ts(data ,seq(from=as.POSIXct("2015-12-18 00:00:00"),
> to=as.POSIXct("2017-10-24 23:00:00"), by="hour"))
>
> tseri <- ts(data ,seq(from=as.Date("2015-12-18 00:00:00"),
> to=as.Date("2017-10-24 23:00:00"), by="hour"))
>
>
> Thank you
>
> --
> Emre
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



--
Erin Hodgess
Associate Professor
Department of Mathematical and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitdistrplus and Custom Probability Density

2017-11-07 Thread Eric Berger
Why not define your own functions based on d?
e.g.
myCumDist <- function(x) { integrate(d, lower=-Inf, upper=x)$value }
myQuantile <- function(p) { uniroot(f=function(y) { myCumDist(y) - p },
interval=c(-5,5))$root }  # limits -5,5 should be replaced by your own,
which might require some fiddling

e.g.
d <- function(x) { exp(-x^2/2)/(sqrt(2*pi)) }  # just an example for you to
test with; use your own density d(x) in your case

Then define myCumDist, myQuantile as above and compare with pnorm, qnorm.

HTH,
Eric
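One way to gain confidence in this workaround is to compare the numeric p and q against the closed-form normal versions (a sketch; the interval and tolerance are arbitrary choices):

```r
d <- function(x) exp(-x^2 / 2) / sqrt(2 * pi)   # standard normal density

myCumDist  <- function(x) integrate(d, lower = -Inf, upper = x)$value
myQuantile <- function(p) uniroot(function(y) myCumDist(y) - p,
                                  interval = c(-5, 5), tol = 1e-9)$root

abs(myCumDist(1.3)  - pnorm(1.3))   # essentially zero
abs(myQuantile(0.9) - qnorm(0.9))   # essentially zero
```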




On Tue, Nov 7, 2017 at 4:22 PM, Lorenzo Isella 
wrote:

> Dear All,
> Apologies for not providing a reproducible example, but if I could, then I
> would be able to answer myself my question.
> Essentially, I am trying to fit a very complicated custom probability
> distribution to some data.
> Fitdistrplus does in principle everything which I need, but if require me
> to specify not only the density function d, but also the cumulative p and
> and inverse cumulative function q (see for instance
>
> http://www.stat.umn.edu/geyer/old/5101/rlook.html
>
> to understand what these quantities are in the case of a normal
> distribution).
>
> The analytical calculation of p and q is a big task in my case, so my
> question is if there is a workaround for this, i.e. a way to fit the
> unknown parameters of my probability distribution without specifying (at
> least analytically) p and q, but only the density d.
> Many thanks
>
> Lorenzo
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ggplot error

2017-11-08 Thread Eric Berger
I was not able to reproduce this problem. I tried two environments
1. Ubuntu 14.04.5 LTS, R version 3.4.2 (same R version as yours)
2. Windows 10, same R version



On Wed, Nov 8, 2017 at 9:50 AM, Zeki ÇATAV  wrote:

> Hello,
> I've an error recently.
>
> ggplot(data = mtcars, aes(x= wt, y= mpg)) + geom_line()
> Error: Found object is not a stat.
>
> > sessionInfo()
> R version 3.4.2 (2017-09-28)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.04.3 LTS
>
> Matrix products: default
> BLAS: /usr/lib/openblas-base/libblas.so.3
> LAPACK: /usr/lib/libopenblasp-r0.2.18.so
>
> locale:
>  [1] LC_CTYPE=tr_TR.UTF-8   LC_NUMERIC=C
>  LC_TIME=tr_TR.UTF-8
>  [4] LC_COLLATE=tr_TR.UTF-8 LC_MONETARY=tr_TR.UTF-8
> LC_MESSAGES=tr_TR.UTF-8
>  [7] LC_PAPER=tr_TR.UTF-8   LC_NAME=C  LC_ADDRESS=C
>
> [10] LC_TELEPHONE=C LC_MEASUREMENT=tr_TR.UTF-8
> LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] dplyr_0.7.4 purrr_0.2.4 readr_1.1.1 tidyr_0.7.2
>  tibble_1.3.4tidyverse_1.1.1
> [7] ggplot2_2.2.1
>
> loaded via a namespace (and not attached):
>  [1] Rcpp_0.12.13   lubridate_1.7.1lattice_0.20-35class_7.3-14
>  assertthat_0.2.0
>  [6] ipred_0.9-6psych_1.7.8foreach_1.4.3  R6_2.2.2
>  cellranger_1.1.0
> [11] plyr_1.8.4 stats4_3.4.2   httr_1.3.1 rlang_0.1.4
>   lazyeval_0.2.1
> [16] caret_6.0-77   readxl_1.0.0   kernlab_0.9-25 rpart_4.1-11
>  Matrix_1.2-11
> [21] splines_3.4.2  CVST_0.2-1 ddalpha_1.3.1  gower_0.1.2
>   stringr_1.2.0
> [26] foreign_0.8-69 munsell_0.4.3  broom_0.4.2
> compiler_3.4.2 modelr_0.1.1
> [31] pkgconfig_2.0.1mnormt_1.5-5   dimRed_0.1.0   nnet_7.3-12
>   prodlim_1.6.1
> [36] DRR_0.0.2  codetools_0.2-15   RcppRoll_0.2.2 withr_2.1.0
>   MASS_7.3-47
> [41] recipes_0.1.0  ModelMetrics_1.1.0 grid_3.4.2 nlme_3.1-131
>  jsonlite_1.5
> [46] gtable_0.2.0   magrittr_1.5   scales_0.5.0
>  stringi_1.1.5  reshape2_1.4.2
> [51] bindrcpp_0.2   timeDate_3012.100  robustbase_0.92-8  xml2_1.1.1
>  lava_1.5.1
> [56] iterators_1.0.8tools_3.4.2forcats_0.2.0  glue_1.2.0
>  DEoptimR_1.0-8
> [61] sfsmisc_1.1-1  hms_0.3parallel_3.4.2
>  survival_2.41-3yaml_2.1.14
> [66] colorspace_1.3-2   rvest_0.3.2bindr_0.1  haven_1.1.0
>
>
> > conflicts(detail = TRUE)
> $.GlobalEnv
> [1] "iris"
>
> $`package:dplyr`
>  [1] "%>%"   "%>%"   "add_row"   "as_data_frame"
> "as_tibble" "data_frame"
>  [7] "data_frame_"   "frame_data""glimpse"   "lst"
>  "lst_"  "tbl_sum"
> [13] "tibble""tribble"   "trunc_mat" "type_sum"
> "filter""lag"
> [19] "intersect" "setdiff"   "setequal"  "union"
>
> $`package:purrr`
> [1] "%>%" "%>%"
>
> $`package:tidyr`
> [1] "%>%" "%>%"
>
> $`package:tibble`
>  [1] "add_row"   "as_data_frame" "as_tibble" "data_frame"
> "data_frame_"   "frame_data"
>  [7] "glimpse"   "lst"   "lst_"  "tbl_sum"
>  "tibble""tribble"
> [13] "trunc_mat" "type_sum"
>
> $`package:ggplot2`
> [1] "Position"
>
> $`package:stats`
> [1] "filter" "lag"
>
> $`package:datasets`
> [1] "iris"
>
> $`package:methods`
> [1] "body<-""kronecker"
>
> $`package:base`
> [1] "body<-""intersect" "kronecker" "Position"  "setdiff"   "setequal"
> "union"
>
>
> How can I solve this problem?
> Thanks.
>
> --
> Zeki Çatav
> zekicatav.com
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding Records to a Table in R

2017-11-08 Thread Eric Berger
Hi Paul,
The following worked for me:

library(lubridate)
dataset1 <- read.csv("dataset1.csv",stringsAsFactors=FALSE)
dataset1$TransitDate <- mdy(dataset1$TransitDate)
TransitDateFrame <- data.frame(TransitDate=seq(as.Date("1985-10-01"),
as.Date("2017-10-01"), by = "month"))
dataset1NEW <- merge(TransitDateFrame, dataset1, by="TransitDate",
all.x=TRUE)

HTH,
Eric



On Wed, Nov 8, 2017 at 4:32 PM, PIKAL Petr  wrote:

> Hi
>
> Instead of attachments copy directly result of dput(TransitDateFrame) and
> dput(dataset1) to your email. Or, if your data have more than about 20 rows
> you could copy only part of it.
>
> dput(TransitDateFrame[,1:20])
> dput(dataset1[,1:20])
>
> Only with this approach we can evaluate your data in all aspects and
> provide correct answer.
>
> Cheers
> Petr
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Paul
> Bernal
> > Sent: Wednesday, November 8, 2017 2:46 PM
> > To: Eric Berger 
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Adding Records to a Table in R
> >
> > Dear Eric,
> >
> > Hope you are doing great. I also tried the following:
> >
> > #First I created the complete date sequence
> >
> > TransitDateFrame <- data.frame(TransitDate=seq(as.Date(dataset1[1,1]),
> > as.Date(dataset1[nrow(dataset1),1]), by = "month"))
> >
> > #Then I did the merging
> >
> >  dataset1NEW <- merge(TransitDateFrame, dataset1, by="TransitDate",
> > all.x=TRUE)
> >
> > Now it has, as expected the total number of rows. The problem is, it
> filled
> > absolutely everything with NAs, and this shouldn´t be the case since
> there are
> > dates that actually have data.
> >
> > why is this happening?
> >
> > I am attaching the dataset1 table as a .csv document for your reference.
> > Basically what I want is to bring all the values in dataset1 and only
> add the
> > dates missing with value 0.
> >
> > Best regards,
> >
> > Paul
> >
> > 2017-11-01 15:21 GMT-05:00 Eric Berger :
> >
> > > Hi Paul,
> > >
> > > #First I set up some sample data since I don't have a copy of your
> > > data dtOrig <- as.Date(
> > > c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
> > > dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
> > > CargoTons=c(1000,1080,3785,4200) )
> > >
> > > #Generate the complete set of dates as a data frame
> > > dfDates<- data.frame( TransitDate=seq(from=as.Date("1985-04-01"),by="1
> > > month",length=13) )
> > >
> > > # do the merge adding the "missing" rows (where NA will appear) dfNew
> > > <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE )
> > >
> > > # replace the NA's by zero
> > > dfNew[is.na(dfNew)] <- 0
> > >
> > > HTH,
> > > Eric
> > >
> > >
> > > On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal 
> > > wrote:
> > >
> > >> Dear R friends,
> > >>
> > >> I am currently working with time series data, and I have a table(as
> > >> data
> > >> frame) that has looks like this (TransitDate are in format =
> "%e-%B-%Y") :
> > >>
> > >> TransitDate   Transits  CargoTons
> > >> 1985-04-01    100       2500
> > >> 1985-05-01    135       4500
> > >> 1985-06-01    120       1750
> > >> 1985-07-01    100       3750
> > >> 1985-08-01    200       1250
> > >>
> > >> The problem is, that there are several periods that don´t exist in
> > >> the table, so it has the following behavior:
> > >>
> > >> TransitDate   Transits  CargoTons
> > >> 1985-04-01    100       1000
> > >> 1985-07-01    100       1080
> > >> 1985-12-01    500       3785
> > >> 1986-04-01    325       4200
> > >> .
> > >> .
> > >> 2017-09-01    400       2350 (*this is the last observation)
> > >>
> > >> You can see in the last table fragment that the series jumps from
> > >> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01
> > >> making the time series quite irregular (non-constant chronologically
>

Re: [R] Calculating frequencies of multiple values in 200 colomns

2017-11-10 Thread Eric Berger
How about this workaround - add 1 to the vector
x <- c(1,0,2,1,0,2,2,0,2,1)
tabulate(x)
# [1] 3 4
tabulate(x+1)
#[1] 3 3 4
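Applied column-wise, as in the original question about 200 columns, the shifted tabulate() keeps all three counts (including those for 0) aligned:

```r
set.seed(1)
Values <- as.data.frame(matrix(sample(0:2, 20, replace = TRUE), ncol = 4))

# shift by 1 so the value 0 lands in bin 1; nbins = 3 pads short columns
freqs <- apply(Values, 2, function(col) tabulate(col + 1, nbins = 3))
rownames(freqs) <- 0:2

freqs   # one column of counts (for 0, 1, 2) per original column
```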


On Fri, Nov 10, 2017 at 4:34 PM, Marc Schwartz  wrote:

> Hi,
>
> To clarify the default behavior that Boris is referencing below, note the
> definition of the 'bin' argument to the tabulate() function:
>
> bin: a numeric vector ***(of positive integers)***, or a factor. Long
> vectors are supported.
>
> I added the asterisks for emphasis.
>
> This is also noted in the examples used for the function in ?tabulate at
> the bottom of the help page.
>
> The second argument, 'nbins', which defaults to max(1, bin, na.rm = TRUE),
> also affects the output:
>
> > tabulate(c(2, 3, 5))
> [1] 0 1 1 0 1
>
> In this case, with each element in the returned vector indicating how many
> 1's, 2's, 3's, 4's and 5's are present in the source vector.
>
> Compare that to:
>
> > tabulate(c(2, 3, 5), nbins = 3)
> [1] 0 1 1
>
> In the above example, 5 is ignored.
>
> Note also that tabulate(), unlike table(), does not return a named vector,
> just the frequencies.
>
> While tabulate() is used within the table() function, reviewing the code
> for the latter reveals how the default behavior of tabulate() is modified
> and preceded/wrapped in other code for use there.
>
> Regards,
>
> Marc Schwartz
>
>
> > On Nov 10, 2017, at 8:43 AM, Boris Steipe 
> wrote:
> >
> > |> x <- sample(0:2, 10, replace = TRUE)
> > |> x
> > [1] 1 0 2 1 0 2 2 0 2 1
> > |> tabulate(x)
> > [1] 3 4
> > |> table(x)
> > x
> > 0 1 2
> > 3 3 4
> >
> >
> >
> > B.
> >
> >
> >
> >> On Nov 10, 2017, at 4:32 AM, Allaisone 1 
> wrote:
> >>
> >>
> >>
> >> Thank you for your effort Bert..,
> >>
> >>
> >> I know what the problem is now: the values (1, 2, 3) were only an
> example. The values I have are 0, 1, 2. The tabulate() function seems to
> skip counting the frequency of 0 values, and this is my exact problem,
> as the frequency of 0 values should also be counted for the maf to be
> calculated correctly.
> >>
> >> 
> >> From: Bert Gunter 
> >> Sent: 09 November 2017 23:51:35
> >> To: Allaisone 1; R-help
> >> Subject: Re: [R] Calculating frequencies of multiple values in 200
> colomns
> >>
> >> [[elided Hotmail spam]]
> >>
> >> "For example, if I have the values : 1 , 2 , 3 in each column, applying
> Tabulate () would calculate the frequency of 1 and 2 without 3"
> >>
> >> Huh??
> >>
> >>> x <- sample(1:3,10,TRUE)
> >>> x
> >> [1] 1 3 1 1 1 3 2 3 2 1
> >>> tabulate(x)
> >> [1] 5 2 3
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Thu, Nov 9, 2017 at 3:44 PM, Allaisone 1  mailto:allaiso...@hotmail.com>> wrote:
> >>
> >> Thank you so much for your replay
> >>
> >>
> >> Actually, I tried apply() function but struggled with the part of
> writing the appropriate function inside it which calculate the frequency of
> the 3 values. Tabulate () function is a good start but the problem is that
> this calculates the frequency of two values only per column which means
> that when I apply maf () function , maf value will be calculated using the
> frequency of these 2 values only without considering the frequency of the
> 3rd value. For example, if I have the values : 1 , 2 , 3 in each column,
> applying Tabulate () would calculate the frequency of 1 and 2 without 3 . I
> need a way to calculate the frequencies of all of the 3 values so the
> calculation of maf will be correct as it will consider all the 3
> frequencies but not only 2 .
> >>
> >>
> >> Regards
> >>
> >> Allahisone
> >>
> >> 
> >> From: Bert Gunter mailto:bgunter.4...@gmail.com
> >>
> >> Sent: 09 November 2017 20:56:39
> >> To: Allaisone 1
> >> Cc: r-help@R-project.org
> >> Subject: Re: [R] Calculating frequencies of multiple values in 200
> colomns
> >>
> >> This is not a good way to do things! R has many powerful built in
> functions to do this sort of thing for you. Searching  -- e.g. at
> rseek.org or even a plain old google search -- can help
> you find them. Also, it looks like you need to go through a tutorial or two
> to learn more about R's basic functionality.
> >>
> >> In this case, something like (no reproducible example given, so can't
> confirm):
> >>
> >> apply(Values, 2, function(x)maf(tabulate(x)))
> >>
> >> should be close to what you want .
> >>
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Thu, Nov 9, 2017 at 11:44 AM, Allaisone 1  mailto:allaiso...@hotmail.com>> wrote:
> >>
> >> Hi All
> >>
> >>
> >> I have a dataset of 200 columns and 1000 rows , there are 3

Re: [R] effects package x axis labels

2017-11-11 Thread Eric Berger
Hi Andras,
I have not used this package before but I did the following steps to arrive
at an answer. Hopefully both the answer is what you are looking for and
also the steps to understand how you can answer such questions yourself in
the future.
1. R is an object-oriented language, but there are several ways in which
classes are supported. In particular, methods for some classes don't reside
with the class but with extensions to "generic" functions. The 'plot'
function is such an example. So the first step is to understand the class
returned by the function allEffects.

> myObj <- allEffects(mylogit)
> class(myObj)
# efflist

2. Next look at the documentation for the extensions to 'plot' for an
'efflist' class

> ?plot.efflist

3. Search in the help documentation for 'axes' to understand what is going
on (they also supply a lot of examples at the end of the help page). A few
experiments and the following seems to do what you asked for:
> plot(allEffects(mylogit),
+
axes=list(x=list(gre=list(lab="black"),gpa=list(lab="white"),rank=list(lab="green")),
+y=list(lab="Prob(xyz)")))

HTH,
Eric



On Sat, Nov 11, 2017 at 2:20 AM, Andras Farkas via R-help <
r-help@r-project.org> wrote:

> Dear All,
>
> probably a simple enough solution but don;t seem to be able to get my head
> around it...example based on a publicly available data set:
>
> mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv";)
> mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family =
> "binomial")
> library(effects)
> plot(allEffects(mylogit)
>  ,axes=list(y=list(lab="Prob(xyz)"))
> )
>
> axes=list(y=list(lab="Prob(xyz)")) changes the y axis labels for all 3
> plots... Any thoughts on how I could change the x axis labels to let say
> 'black' (plot 1), 'white' (plot 2) and 'green' (plot 3) for the 3
> respective plots produced?
>
>
> appreciate the help...
>
> Andras
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R6 object that is a list of referenced object

2017-11-16 Thread Eric Berger
Hi Cristina,
You can try this:

> Community <- R6Class("Community",
+   public = list(
+     e = NULL,
+     initialize = function() { self$e <- list() },
+     add = function( person ) { self$e[[ length(self$e) + 1 ]] <- person }
+   )
+ )

> crowd <- Community$new()
> crowd$add(Person1)
> crowd$add(Person2)
> crowd$e

HTH,
Eric


On Thu, Nov 16, 2017 at 9:55 AM, Jeff Newmiller 
wrote:

> See below.
>
> On Wed, 15 Nov 2017, Cristina Pascual wrote:
>
> Dear community,
>>
>> I am having a class, let's say Person,
>>
>> Person <-  R6Class("Person",
>> public = list(
>>   idPerson = NULL,
>>   name = NULL,
>>   age = NULL,
>>   initialize = function(idPerson = NA, name = NA, age
>> = NA) {
>>
>
> It is a bad idea to setup default values for all your parameters in any
> function, but particularly so for an initialization function. A Person with
> NA in the idPerson field is essentially unusable, so encouraging the
> creation of such an object is very bad practice.
>
>self$idPerson <- idPerson
>>self$name <- name
>>self$age <- age
>>   }
>> ) # public
>>
>> ) # Person
>>
>> I have created:
>> Person1 <- Person$new(1,'a',4)
>> Person2 <- Person$new(2,'b',5)
>>
>> and I also have a class Community:
>>
>> Community <- R6Class("Community",
>> public = list(
>>   e = NULL,
>>   initialize = function() self$e <- Person$new()
>>
>
> Initializing a Community with a bogus person is as bad as the idPerson
> being NA. It makes a lot more sense to have the set of persons in a
> community be the null set than to have a minimum of one person in the
> community who happens to have invalid identification.
>
> )
>> )
>>
>> I want to create
>>
>> Community1 = List
>>
>> and add Person1 and Person2 to Community1 (Community1 <-
>> Community1$add(Person1)
>>
>>  Community1 <- Community1$add(Person2)
>>
>> )
>>
>> How can I write this with R6? I cannot find the proper example in the
>> website.
>>
>> Can anybody help me?
>>
>> Thanks in advance,
>>
>
> You don't seem to be very familiar with either R or conventional
> object-oriented design. Although I am giving you a reprex below, I
> recommend that you avoid R6 until you are more familiar with how normal
> functional programming and S3 object oriented coding styles work in R.
> Using R6 as a crutch to avoid that learning process will only lead you to
> frustration and inefficient data handling. That is, this whole thing should
> just be a data frame.
>
> 
> library(R6)
> Person <-  R6Class( "Person"
>   , public = list( idPerson = NA
>  , name = NA
>  , age = NA
>  , initialize = function( idPerson
>
>  , name
>
>  , age
>
>  ) {
>
>self$idPerson <- idPerson
>
>self$name <- name
>
>self$age <- age
>
>  }
>  ) # public
>   ) # Person
>
> Person1 <- Person$new( 1, 'a', 4 )
> Person2 <- Person$new( 2, 'b', 5 )
>
> Community <- R6Class( "Community"
> , public = list( e = NULL
>
> , addPerson = function( p ) {
>
>self$e <- append( self$e, p )
>
>   }
>
> )
> )
>
> Community1 <- Community$new()
> Community1$addPerson( Person1 )
> Community1$addPerson( Person2 )
> Community1$e
> #> [[1]]
> #> 
> #>   Public:
> #> age: 4
> #> clone: function (deep = FALSE)
> #> idPerson: 1
> #> initialize: function (idPerson, name, age)
> #> name: a
> #>
> #> [[2]]
> #> 
> #>   Public:
> #> age: 5
> #> clone: function (deep = FALSE)
> #> idPerson: 2
> #> initialize: function (idPerson, name, age)
> #> name: b
>
> # Standard R approach:
> Person1a <- data.frame( idPerson = 1
>   , name = "a"
>   , age = 4
>   , stringsAsFactors = FALSE
>   )
> Person2a <- data.frame( idPerson = 2
>   , name = "b"
>   , age = 5
>   , stringsAsFactors = FALSE
>   )
> Community1a <- rbind( Person1a, Person2a )
> Community1a
> #>   idPerson name age
> #> 1        1    a   4
> #> 2        2    b   5
> 
>
>
>>
>>
>> [[alternative HTML version deleted]]
>>
>
> Please POST IN PLAIN TEXT FORMAT. This is a setting you must make in your
> email program, and failure to do so will lead to us seeing different things
> than you send (that is, we see varying degrees of scrambling).

Re: [R] Risks of using "function <- package::function" ?

2017-11-17 Thread Eric Berger
As Jeff recommends, I use the pkg::fun for clarity.
However I probably use it more than needed (e.g. I use the dplyr:: prefix
on all dplyr function calls instead of just the functions with name
collisions).
Are there any tools that can be used (like a form of lint) to identify uses
of functions without the pkg:: prefix and which are part of a name
collision?
One could then edit the code to include the pkg:: prefix to disambiguate
those cases and verify via a repeated use of such a tool that there are no
outstanding cases.
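(A crude runtime inventory, short of a true lint: base R's conflicts() lists
every symbol defined in more than one attached location, e.g.

library(dplyr)                            # a package with well-known masking
conflicts(detail = TRUE)[["package:dplyr"]]
# typically includes "filter", "lag", "intersect", ... -- each a candidate
# for an explicit dplyr:: prefix in the code

though this only reflects the packages attached in the current session, not
every code path.)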

Or alternative approaches to the issue?

Thanks,
Eric


On Fri, Nov 17, 2017 at 9:30 AM, Jeff Newmiller 
wrote:

> Obvious?  How about "obscurity"? Just directly use pkg::fun if you have
> name collision.
> --
> Sent from my phone. Please excuse my brevity.
>
> On November 16, 2017 4:46:15 PM PST, Duncan Murdoch <
> murdoch.dun...@gmail.com> wrote:
> >On 16/11/2017 4:53 PM, Boris Steipe wrote:
> >> Large packages sometimes mask each other's functions and that creates
> >a headache, especially for teaching code, since function signatures may
> >depend on which order packages were loaded in. One of my students
> >proposed using the idiom
> >>
> >> function <- package::function
> >>
> >> ... in a preamble, when we use just a small subset of functions from
> >a larger package. I like that idea and can't see obvious
> >disadvantages(1).
> >>
> >> Are there subtle risks to that approach?
> >
> >You might do it twice.  R isn't going to complain if you have
> >
> >filter <- stats::filter
> >
> ># some other code here...
> >
> >filter <- dplyr::filter
> >
> >in your code, but the second one will overwrite the first one.
> >
> >The normal way to handle this is in the NAMESPACE file, where you
> >should
> >have
> >
> >importFrom(stats, filter)
> >
> >If you then have
> >
> >importFrom(dplyr, filter)
> >
> >you should get a warning:
> >
> >Warning: replacing previous import ‘stats::filter’ by ‘dplyr::filter’
> >when loading ‘testpkg’.
> >
> >Duncan Murdoch
> >
>


Re: [R] Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

2017-11-21 Thread Eric Berger
Hi Joe,
The centering and re-scaling is done for the purposes of his example, and
also to be consistent with his definition of the sharpe function.
In particular, note that the sharpe function has the rf (riskfree)
parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted
to a DAILY rate, expressed in decimal.
That means that the other argument to this function, x, should be DAILY
returns, expressed in decimal.

Suppose he wanted to create random data from a distribution of returns with
ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
The equivalent DAILY

Then he does two steps: (1) generate a matrix of random values from the
N(0,1) distribution. (2) convert them to DAILY
After initializing the matrix with random values (from N(0,1)), he now
wants to create a series of DAILY
sr_base <- 0
mu_base <- sr_base/(252.0)
sigma_base <- 1.00/(252.0)**0.5
for ( i in 1:n ) {
  m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
  m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center}

On Tue, Nov 21, 2017 at 2:10 PM, Bert Gunter  wrote:

> Wrong list.
>
> Post on r-sig-finance instead.
>
> Cheers,
> Bert
>
>
>
> On Nov 20, 2017 11:25 PM, "Joe O"  wrote:
>
> Hello,
>
> I'm trying to understand how to use the pbo package by looking at a
> vignette. I'm curious about a part of the vignette that creates simulated
> returns data. The package author transforms his simulated returns in a way
> that I'm unfamiliar with, and that I haven't been able to find an
> explanation for after searching around. I'm curious if I need to replicate
> the transformation with real returns. For context, here is the vignette
> (cleaned up a bit to make it reproducible):
>
> (Full vignette:
> https://cran.r-project.org/web/packages/pbo/vignettes/pbo.html)
>
> library(pbo)
> #First, we assemble the trials into an NxT matrix where each column
> #represents a trial and each trial has the same length T. This example
> #is random data so the backtest should be overfit.`
>
> set.seed(765)
> n <- 100
> t <- 2400
> m <- data.frame(matrix(rnorm(n*t),nrow=t,ncol=n,
>dimnames=list(1:t,1:n)), check.names=FALSE)
>
> sr_base <- 0
> mu_base <- sr_base/(252.0)
> sigma_base <- 1.00/(252.0)**0.5
> for ( i in 1:n ) {
>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center}
> #We can use any performance evaluation function that can work with the
> #reassembled sub-matrices during the cross validation iterations.
> #Following the original paper we can use the Sharpe ratio as
>
> sharpe <- function(x,rf=0.03/252) {
>   sr <- apply(x,2,function(col) {
> er = col - rf
> return(mean(er)/sd(er))
>   })
>   return(sr)}
> #Now that we have the trials matrix we can pass it to the pbo function
>  #for analysis.
>
> my_pbo <- pbo(m,s=8,f=sharpe,threshold=0)
>
> summary(my_pbo)
>
> Here's the portion i'm curious about:
>
> sr_base <- 0
> mu_base <- sr_base/(252.0)
> sigma_base <- 1.00/(252.0)**0.5
> for ( i in 1:n ) {
>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center}
>
> Why is the data transformed within the for loop, and does this kind of
> re-scaling and re-centering need to be done with real returns? Or is this
> just something the author is doing to make his simulated returns look more
> like the real thing?
>
> Googling around turned up some articles regarding scaling volatility to the
> square root of time, but the scaling in the code here doesn't look quite
> like what I've seen. Re-scalings I've seen involve multiplying some short
> term (i.e. daily) measure of volatility by the root of time, but this isn't
> quite that. Also, the documentation for the package doesn't include this
> chunk of re-scaling and re-centering code. Documentation: https://cran.r-
> project.org/web/packages/pbo/pbo.pdf
>
> So:
>
>-
>
>Why is the data transformed in this way/what is result of this
>transformation?
>-
>
>Is it only necessary for this simulated data, or do I need to
>similarly transform real returns?
>
> I read in the posting guide that stats questions are acceptable given
> certain conditions, I hope this counts. Thanks for reading,
>
> -Joe
>

Re: [R] Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

2017-11-21 Thread Eric Berger
[re-sending - previous email went out by accident before complete]
Hi Joe,
The centering and re-scaling is done for the purposes of his example, and
also to be consistent with his definition of the sharpe function.
In particular, note that the sharpe function has the rf (riskfree)
parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted
to a DAILY rate, expressed in decimal.
That means that the other argument to this function, x, should be DAILY
returns, expressed in decimal.

Suppose he wanted to create random data from a distribution of returns with
ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal.
The equivalent DAILY returns would have mean MU_D = MU_A / 252 and standard
deviation SIGMA_D =  SIGMA_A/SQRT(252).

He calls MU_D by the name mu_base  and  SIGMA_D by the name sigma_base.

His loop now converts the random numbers in his matrix so that each column
has mean MU_D and std deviation SIGMA_D.
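A quick check that the loop produces the intended moments (a small sketch
with n reduced for speed; sr_base = 0 as in the vignette):

set.seed(765)
t <- 2400; n <- 5
m <- matrix(rnorm(n * t), nrow = t, ncol = n)
mu_base    <- 0 / 252          # MU_D
sigma_base <- 1 / sqrt(252)    # SIGMA_D
for ( i in 1:n ) {
  m[,i] <- m[,i] * sigma_base / sd(m[,i])   # re-scale to SIGMA_D
  m[,i] <- m[,i] + mu_base - mean(m[,i])    # re-center to MU_D
}
colMeans(m)      # all effectively mu_base (zero, up to rounding)
apply(m, 2, sd)  # all equal to sigma_base, approx. 0.063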

HTH,
Eric



On Tue, Nov 21, 2017 at 2:33 PM, Eric Berger  wrote:

> Hi Joe,
> The centering and re-scaling is done for the purposes of his example, and
> also to be consistent with his definition of the sharpe function.
> In particular, note that the sharpe function has the rf (riskfree)
> parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted
> to a DAILY rate, expressed in decimal.
> That means that the other argument to this function, x, should be DAILY
> returns, expressed in decimal.
>
> Suppose he wanted to create random data from a distribution of returns
> with ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in
> decimal.
> The equivalent DAILY
>
> Then he does two steps: (1) generate a matrix of random values from the
> N(0,1) distribution. (2) convert them to DAILY
> After initializing the matrix with random values (from N(0,1)), he now
> wants to create a series of DAILY
> sr_base <- 0
> mu_base <- sr_base/(252.0)
> sigma_base <- 1.00/(252.0)**0.5
> for ( i in 1:n ) {
>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center}
>
> On Tue, Nov 21, 2017 at 2:10 PM, Bert Gunter 
> wrote:
>
>> Wrong list.
>>
>> Post on r-sig-finance instead.
>>
>> Cheers,
>> Bert
>>
>>
>>
>> On Nov 20, 2017 11:25 PM, "Joe O"  wrote:
>>
>> Hello,
>>
>> I'm trying to understand how to use the pbo package by looking at a
>> vignette. I'm curious about a part of the vignette that creates simulated
>> returns data. The package author transforms his simulated returns in a way
>> that I'm unfamiliar with, and that I haven't been able to find an
>> explanation for after searching around. I'm curious if I need to replicate
>> the transformation with real returns. For context, here is the vignette
>> (cleaned up a bit to make it reproducible):
>>
>> (Full vignette:
>> https://cran.r-project.org/web/packages/pbo/vignettes/pbo.html)
>>
>> library(pbo)
>> #First, we assemble the trials into an NxT matrix where each column
>> #represents a trial and each trial has the same length T. This example
>> #is random data so the backtest should be overfit.`
>>
>> set.seed(765)
>> n <- 100
>> t <- 2400
>> m <- data.frame(matrix(rnorm(n*t),nrow=t,ncol=n,
>>dimnames=list(1:t,1:n)), check.names=FALSE)
>>
>> sr_base <- 0
>> mu_base <- sr_base/(252.0)
>> sigma_base <- 1.00/(252.0)**0.5
>> for ( i in 1:n ) {
>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center}
>> #We can use any performance evaluation function that can work with the
>> #reassembled sub-matrices during the cross validation iterations.
>> #Following the original paper we can use the Sharpe ratio as
>>
>> sharpe <- function(x,rf=0.03/252) {
>>   sr <- apply(x,2,function(col) {
>> er = col - rf
>> return(mean(er)/sd(er))
>>   })
>>   return(sr)}
>> #Now that we have the trials matrix we can pass it to the pbo function
>>  #for analysis.
>>
>> my_pbo <- pbo(m,s=8,f=sharpe,threshold=0)
>>
>> summary(my_pbo)
>>
>> Here's the portion i'm curious about:
>>
>> sr_base <- 0
>> mu_base <- sr_base/(252.0)
>> sigma_base <- 1.00/(252.0)**0.5
>> for ( i in 1:n ) {
>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center}
>>
>> Why is the data transformed within the for loop, and does this kind of
>> re-scaling and re-centering need to be done with real returns?

Re: [R] Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

2017-11-21 Thread Eric Berger
Correct 

Sent from my iPhone

> On 21 Nov 2017, at 22:42, Joe O  wrote:
> 
> Hi Eric,
> 
> Thank you, that helps a lot. If I'm understanding correctly, if I’m wanting 
> to use actual returns from backtests rather than simulated returns, I would 
> need to make sure my risk-adjusted return measure, sharpe ratio in this case, 
> matches up in scale with my returns (i.e. daily returns with daily sharpe, 
> monthly with monthly, etc). And I wouldn’t need to transform returns like the 
> simulated returns are in the vignette, as the real returns are going to have 
> whatever properties they have (meaning they will have whatever average and 
> std dev they happen to have). Is that correct? 
> 
> Thanks, -Joe
> 
> 
>> On Tue, Nov 21, 2017 at 5:36 AM, Eric Berger  wrote:
>> [re-sending - previous email went out by accident before complete]
>> Hi Joe,
>> The centering and re-scaling is done for the purposes of his example, and 
>> also to be consistent with his definition of the sharpe function.
>> In particular, note that the sharpe function has the rf (riskfree) parameter 
>> with a default value of .03/252 i.e. an ANNUAL 3% rate converted to a DAILY 
>> rate, expressed in decimal.
>> That means that the other argument to this function, x, should be DAILY 
>> returns, expressed in decimal.
>> 
>> Suppose he wanted to create random data from a distribution of returns with 
>> ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal. 
>> The equivalent DAILY returns would have mean MU_D = MU_A / 252 and standard 
>> deviation SIGMA_D =  SIGMA_A/SQRT(252).
>> 
>> He calls MU_D by the name mu_base  and  SIGMA_D by the name sigma_base.
>> 
>> His loop now converts the random numbers in his matrix so that each column 
>> has mean MU_D and std deviation SIGMA_D.
>> 
>> HTH,
>> Eric
>> 
>> 
>> 
>>> On Tue, Nov 21, 2017 at 2:33 PM, Eric Berger  wrote:
>>> Hi Joe,
>>> The centering and re-scaling is done for the purposes of his example, and 
>>> also to be consistent with his definition of the sharpe function.
>>> In particular, note that the sharpe function has the rf (riskfree) 
>>> parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted 
>>> to a DAILY rate, expressed in decimal.
>>> That means that the other argument to this function, x, should be DAILY 
>>> returns, expressed in decimal.
>>> 
>>> Suppose he wanted to create random data from a distribution of returns with 
>>> ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal. 
>>> The equivalent DAILY
>>> 
>>> Then he does two steps: (1) generate a matrix of random values from the 
>>> N(0,1) distribution. (2) convert them to DAILY
>>> After initializing the matrix with random values (from N(0,1)), he now 
>>> wants to create a series of DAILY
>>> sr_base <- 0
>>> mu_base <- sr_base/(252.0)
>>> sigma_base <- 1.00/(252.0)**0.5
>>> for ( i in 1:n ) {
>>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center}
>>> 
>>>> On Tue, Nov 21, 2017 at 2:10 PM, Bert Gunter  
>>>> wrote:
>>>> Wrong list.
>>>> 
>>>> Post on r-sig-finance instead.
>>>> 
>>>> Cheers,
>>>> Bert
>>>> 
>>>> 
>>>> 
>>>> On Nov 20, 2017 11:25 PM, "Joe O"  wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> I'm trying to understand how to use the pbo package by looking at a
>>>> vignette. I'm curious about a part of the vignette that creates simulated
>>>> returns data. The package author transforms his simulated returns in a way
>>>> that I'm unfamiliar with, and that I haven't been able to find an
>>>> explanation for after searching around. I'm curious if I need to replicate
>>>> the transformation with real returns. For context, here is the vignette
>>>> (cleaned up a bit to make it reproducible):
>>>> 
>>>> (Full vignette:
>>>> https://cran.r-project.org/web/packages/pbo/vignettes/pbo.html)
>>>> 
>>>> library(pbo)
>>>> #First, we assemble the trials into an NxT matrix where each column
>>>> #represents a trial and each trial has the same length T. This example
>>>> #is random data so the backtest should be overfit.`
>>>> 
>>>> set.seed(765)
>

Re: [R] Scatterplot of many variables against a single variable

2017-11-27 Thread Eric Berger
LOL. Great reply Jim.
(N.B. Jim's conclusion is "debatable" by a judicious choice of seed. e.g.
set.seed(79) suggests that making the request more readable will actually
lower the number of useful answers. :-))


On Mon, Nov 27, 2017 at 11:42 AM, Jim Lemon  wrote:

> Hi Engin,
> Sadly, your illustration was ambushed on the way to the list. Perhaps
> you want something like this:
>
> # proportion of useful answers to your request
> pua<-sort(runif(20))
> #legibility of your request
> lor<-sort(runif(20))+runif(20,-0.5,0.5)
> # is a data set provided?
> dsp<-sort(runif(20))+runif(20,-0.5,0.5)
> # generate a linear model for the above
> pua.lm<-lm(pua~lor+dsp)
> # get the coefficients
> pua.lm
>
> Call:
> lm(formula = pua ~ lor + dsp)
>
> Coefficients:
> (Intercept)  lor  dsp
> 0.1692   0.6132   0.3311
>
> plot(pua~lor,col="red",main="Proportion of useful answers by request
> quality")
> points(pua~dsp,col="blue",pch=2)
> abline(0.1692,0.6132,col="red")
> abline(0.1692,0.3311,col="blue")
>
> So, the more readable your request and the quality of the data that
> you provide, the more useful answers you are likely to receive.
>
> Jim
>
>
> On Mon, Nov 27, 2017 at 7:56 PM, Engin YILMAZ 
> wrote:
> > Dear
> >
> > I try to realize one scatter matrix which draws *one single variable to
> all
> > variables* with *regression line* . You can see my eviews version  in the
> > annex .
> >
> > How can I draw this graph with R studio?
> >
> >
> > Sincerely
> > Engin YILMAZ
>
>



Re: [R] DeSolve Package and Moving Average

2017-11-29 Thread Eric Berger
Since you only provide pseudo-code I will give a guess as to the source of
the problem.
It is easy to get "burned" by use of the ifelse statement. Its results have
the same "shape" as the first argument.
My suggestion is to try replacing ifelse by a standard

if (  ) {
} else {
}
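A one-line illustration of the shape issue (using a toy vector in place of
the movavg result):

x <- c(5, 6, 7)
ifelse(FALSE, 1, x)   # 5 -- result takes the shape of the scalar test
if (FALSE) 1 else x   # 5 6 7 -- returns the full vector, as intended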

HTH,
Eric



On Wed, Nov 29, 2017 at 1:29 PM, Werning, Jan-Philipp <
jan-philipp.wern...@whu.edu> wrote:

> Dear all,
>
>
> I am using the DeSolve Package to simulate a system dynamics model. At the
> problematic point in the model, I basically want to decide how many
> products shall be produced to be sold. In order to determine the amount a
> basic forecasting model of using the average of the last 12 time periods
> shall be used. My code looks like the following.
>
> “ […]
>
> # Time units in month
> START<-0; FINISH<-120; STEP<-1
>
> # Set seed for reproducability
>
>  set.seed(123)
>
> # Create time vector
> simtime  <- seq(START, FINISH, by=STEP)
>
> # Create a stock vector with initial values
> stocks   <- c([…])
>
> # Create an aux vector for the fixed aux values
> auxs<- c([…])
>
>
> model <- function(time, stocks, auxs){
>   with(as.list(c(stocks, auxs)),{
>
> [… “lots of aux, flow, and stock functions” … ]
>
>
> aMovingAverage  <-  ifelse(exists("ResultsSimulation")=="FALSE",
> 1,movavg(ResultsSimulation$TotalSales, 12, type = "s"))
>
>
> return (list(c([…]))
>
>   })
> }
>
> # Call Solver, and store results in a data frame
> ResultsSimulation <-  data.frame(ode(y=stocks, times=simtime, func = model,
>   parms=auxs, method="euler"))
>
> […]”
>
> My problem is, that the moving average (function: movavg) is only computed
> once and the same value is used in every timestep of the model. I.e. When
> running the model for the first time, 1 is used, running it for the
> next time the total sales value of the first timestep is used. Since only
> one timestep exists, this is logical. Yet  I would expect the movavg
> function to produce a new value in each of the 120 timesteps, as it is the
> case with all other flow, stock and aux calculations as well.
>
> It would be great if you could help me with fixing this problem.
>
>
> Many thanks in advance!
>
> Yours,
>
> Jan
>
>
>
>
>


Re: [R] source files in temp environment

2017-12-02 Thread Eric Berger
I totally agree with Duncan's last point. I find it hard to reconcile your
early remarks (which indicate a deep knowledge of programming) with the
idea that your code is not built up from combining small(ish) functions.
Small functions would generally be considered best practices. Try searching
on this topic to see discussions and pointers.

On Sat, Dec 2, 2017 at 1:01 PM, Duncan Murdoch 
wrote:

> On 02/12/2017 5:48 AM, Alexander Shenkin wrote:
>
>> Hi all,
>>
>> I often keep code in separate files for organizational purposes, and
>> source() that code from higher level scripts.  One problem is that those
>> sourced files often create temporary variables that I don't want to keep
>> around.  I could clean up after myself with lots of rm()'s, but that's a
>> pain, and is messy.
>>
>> I'm wondering if one solution might be to source the code in a temporary
>> environment, assign outputs of interest to the .GlobalEnv with <<-, and
>> then delete the environment afterwards.  One way to do this:
>>
>> file.r:
>> temp1 = 1
>> temp2 = 2
>> desired_var <<- temp1 + temp2
>>
>> console:
>> temp_e = new.env()
>> source("file.r", local = temp_e)
>> rm(temp_e)
>>
>> It's a bit messy to create and delete environments, so I tried what
>> others have referred to:
>>
>> source("file.r", local = attach(NULL))
>>
>> This, however, results in a persistent "NULL" environment in the search
>> path.
>>
>>   > search()
>>  [1] ".GlobalEnv"        "package:bindrcpp"  "NULL"
>>  [4] "tools:rstudio"     "package:stats"     "package:graphics"
>>  [7] "package:grDevices" "package:utils"     "package:datasets"
>> [10] "package:methods"   "Autoloads"         "package:base"
>>
>> Of course, functions are built to encapsulate like this (and do so in
>> their own temporary environment), but in many cases, turning the sourced
>> code into functions is possible but clunky.
>>
>> Any thoughts or suggestions would be much appreciated.
>>
>
> I would wrap the calls in the local() function, or put them in a function
> and call that.  That is,
>
> local({
>   source("file.R", local = TRUE)
> })
>
> or
>
> sourceit <- function() {
>   source("file.R", local = TRUE)
> }
> sourceit()
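> For instance, with a throwaway file (written to tempdir() purely for
> illustration, in a fresh session):
>
> f <- tempfile(fileext = ".R")
> writeLines(c("temp1 <- 1",
>  "temp2 <- 2",
>  "desired_var <<- temp1 + temp2"), f)
> local(source(f, local = TRUE))
> exists("temp1")   # FALSE: the helper variables were discarded
> desired_var   # 3: only the <<- assignment reached .GlobalEnv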
>
> With respect to your last comment (turning the code in file.R into
> functions which don't leave their locals behind):  I think that would be
> the best solution.  You may find it clunky now, but in the long run it
> likely will help you to make better code.
>
> Duncan Murdoch
>
>
>



Re: [R] Rcpp, dyn.load and C++ problems

2017-12-02 Thread Eric Berger
.Call("compute_values_cpp")
Also, if you were passing arguments to the C++ function you would need to
declare the function differently.
Do a search on "Rcpp calling C++ functions from R"
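Alternatively, Rcpp::sourceCpp() handles the compile/load/bind steps in one
call and makes the exported function visible under its own name (assuming
the logistic_map.cpp from your message):

library(Rcpp)
sourceCpp("logistic_map.cpp")   # compiles and exports compute_values_cpp()
res <- compute_values_cpp(totalPoints = 1000L)
str(res)   # a list with numeric components x and y, each of length 1000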

HTH,
Eric


On Sun, Dec 3, 2017 at 3:06 AM, Martin Møller Skarbiniks Pedersen <
traxpla...@gmail.com> wrote:

> Hi,
>
>   I have written a small C++ function and compile it.
>   However in R I can't see the function I have defined in C++.
>   I have read some web-pages about Rcpp and C++ but it is a bit confusion
> for me.
>
> Anyway,
>   This is the C++-code:
>
> #include <Rcpp.h>
> using namespace Rcpp;
>
> // [[Rcpp::export]]
> List compute_values_cpp(int totalPoints = 1e5, double angle_increment =
> 0.01, int radius = 400, double grow = 3.64) {
>   double xn = 0.5;
>   double angle = 0.1;
>   double xn_plus_one, yn_plus_one;
>   NumericVector x(totalPoints);
>   NumericVector y(totalPoints);
>
>   for (int i=0; i<totalPoints; i++) {
> xn_plus_one = xn*cos(angle)*radius;
> yn_plus_one = xn*sin(angle)*radius;
> angle += angle_increment;
> xn = grow*xn*(1-xn);
> x[i] = xn_plus_one;
> y[i] = yn_plus_one;
>   }
>   return List::create(Rcpp::Named("x") = x, Rcpp::Named("y") = y);
> }
>
> And I compile it like this:
> PKG_CXXFLAGS=$(Rscript -e 'Rcpp:::CxxFlags()') \
> PKG_LIBS=$(Rscript -e 'Rcpp:::LdFlags()')  \
> R CMD SHLIB logistic_map.cpp
> without problems and I get a logistic_map.so file as expected.
>
> However in R:
> R> dyn.load("logistic_map.so")
> R> compute_values_cpp()
> Error in compute_values_cpp() :
>   could not find function "compute_values_cpp"
>
> Please advise,
>   What piece of the puzzle is missing?
>
> Regards
> Martin M. S. Pedersen
>
>


Re: [R] problem with the behaviour of dashed lines in R plots

2017-12-04 Thread Eric Berger
Hi,
Sarah's last comment about using min/max x got me thinking. It's not that
the points are "very close together", it's that the x-values are not
ordered. So the plot is actually drawing a dashed line back-and-forth
between different points on the line, which has the effect of making the
result appear non-dashed. If you sort by the x-values before plotting you
will see the output is very different. I am not saying this is a better
solution than Sarah's regarding just using the end-points, but at least it
partially explains the relevance of that suggestion.

For example, here is a slightly modified version of the code that does an
ordering before plotting:

df1<-data.frame(B=runif(20,1.4,1.6),A=runif(20,-19.5,-9.8))
regressor<-lm(A~B,data = df1)
df2 <- data.frame( x=df1$B, yhat= predict(regressor,df1))
df2 <- df2[ order(df2$x), ]
plot(df2$x,df2$yhat,type="l", col="black", mgp=c(2,0.5,0),cex.lab=1.6,
lwd=2, lty=2,xlim=range(c(1.2,1.7)),ylim=rev(range(c(-19,-8))))
par(new = TRUE)
plot(df1$B,as.numeric(df1$A),type="p", col="black",
mgp=c(2,0.5,0),cex.lab=1.6,cex=2, xlab = "", ylab =
"",xlim=range(c(1.2,1.7)),ylim=rev(range(c(-19,-8))),pch=17)
box(lwd=3)


HTH,
Eric

On Mon, Dec 4, 2017 at 8:30 PM, jean-philippe <
jeanphilippe.fonta...@gssi.infn.it> wrote:

> hi Sarah,
>
> Thanks a lot for having taken time to answer me and for your reply. I
> wonder how I missed this solution. Indeed plotting the line with the 2
> extreme data points works perfectly.
>
>
> Best,
>
>
> Jean-Philippe Fontaine
>
>
>
> On 04/12/2017 18:30, Sarah Goslee wrote:
>
>> It's because you are plotting a line between each of the points in
>> your data frame, and they are very close together.
>>
>
> --
> Jean-Philippe Fontaine
> PhD Student in Astroparticle Physics,
> Gran Sasso Science Institute (GSSI),
> Viale Francesco Crispi 7,
> 67100 L'Aquila, Italy
> Mobile: +393487128593, +33615653774
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Curiously short cycles in iterated permutations with the same seed

2017-12-07 Thread Eric Berger
Hi Boris,
Do a search on "the order of elements of the symmetric group". (This search
will also turn up homework questions and solutions.) You will understand
why you are seeing this once you understand how a permutation is decomposed
into cycles and how the order relates to a partition of n (n=10 in your
case).
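
To make this concrete, here is a small sketch (mine, not a standard reference
implementation) that decomposes a permutation into cycles and returns its
order, i.e. the least common multiple of the cycle lengths. For n = 10 the
largest achievable order (Landau's function) is 30, which matches the cycle
lengths you observed:

```r
# Sketch: the order of a permutation is the lcm of its cycle lengths.
perm_order <- function(p) {
  n <- length(p)
  seen <- rep(FALSE, n)
  lens <- integer(0)
  for (i in seq_len(n)) {
    if (!seen[i]) {
      len <- 0L
      j <- i
      while (!seen[j]) {        # walk the cycle containing i
        seen[j] <- TRUE
        j <- p[j]
        len <- len + 1L
      }
      lens <- c(lens, len)
    }
  }
  gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)
  Reduce(function(a, b) a / gcd(a, b) * b, lens)   # lcm over cycle lengths
}

perm_order(c(2, 3, 1, 5, 4))   # cycles of length 3 and 2 -> order 6
```

Iterating sample() with a fixed seed applies the same permutation each time,
so the vector returns to its start after exactly perm_order() iterations.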

Enjoy!
Eric


On Fri, Dec 8, 2017 at 6:39 AM, Boris Steipe 
wrote:

> I have noticed that when I iterate permutations of short vectors with the
> same seed, the cycle lengths are much shorter than I would expect by
> chance. For example:
>
> X <- 1:10
> Xorig <- X
> start <- 112358
> N <- 10
>
> for (i in 1:N) {
>   seed <- start + i
>   for (j in 1:1000) { # Maximum cycle length to consider
> set.seed(seed)# Re-seed RNG to same initial state
> X <- sample(X)# Permute X and iterate
> if (all(X == Xorig)) {
>   cat(sprintf("Seed:\t%d\tCycle: %d\n", seed, j))
>   break()
> }
>   }
> }
>
> Seed:   112359  Cycle: 14
> Seed:   112360  Cycle: 14
> Seed:   112361  Cycle: 8
> Seed:   112362  Cycle: 14
> Seed:   112363  Cycle: 8
> Seed:   112364  Cycle: 10
> Seed:   112365  Cycle: 10
> Seed:   112366  Cycle: 10
> Seed:   112367  Cycle: 9
> Seed:   112368  Cycle: 12
>
> I understand that I am performing the same permutation operation over and
> over again - but I don't see why that would lead to such a short cycle (in
> fact the cycle for the first 100,000 seeds is never longer than 30). Does
> this have a straightforward explanation?
>
>
> Thanks!
> Boris
>


Re: [R] difference between ifelse and if...else?

2017-12-13 Thread Eric Berger
ifelse() returns a result with the "shape" of its first argument (the test).

In your ifelse the shape of "3 > 2" is a vector of length one, so it will
return a vector of length one.

Avoid "ifelse" until you are very comfortable with it. It can often burn
you.
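
A short illustration of the recycling behaviour (toy values, beyond the first
call these are not from the original question):

```r
ifelse(3 > 2, 1:3, length(1:3))            # test has length 1 -> result has length 1
#> [1] 1
ifelse(c(TRUE, FALSE, TRUE), 1:3, -(1:3))  # element-wise pick from yes/no
#> [1]  1 -2  3
if (3 > 2) 1:3 else length(1:3)            # if/else returns the whole branch
#> [1] 1 2 3
```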




On Wed, Dec 13, 2017 at 5:33 PM, jeremiah rounds 
wrote:

> ifelse is vectorized.
>
> On Wed, Dec 13, 2017 at 7:31 AM, Jinsong Zhao  wrote:
>
> > Hi there,
> >
> > I don't know why the following codes are return different results.
> >
> > > ifelse(3 > 2, 1:3, length(1:3))
> > [1] 1
> > > if (3 > 2) 1:3 else length(1:3)
> > [1] 1 2 3
> >
> > Any hints?
> >
> > Best,
> > Jinsong
> >


Re: [R] help with recursive function

2017-12-14 Thread Eric Berger
You seem to have a typo at this expression (and some others like it)

Namely, you write

any(!dat2$norm_sd) >= 1

when you possibly meant to write

!( any(dat2$norm_sd) >= 1 )

i.e. I think your ! seems to be in the wrong place.
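
To see the difference the placement makes, compare the two forms on a toy
vector (hypothetical values):

```r
x <- c(0, 2, 0.3)
any(!x) >= 1    # !x coerces x to logical (only 0 becomes TRUE), so this is TRUE >= 1
#> [1] TRUE
!any(x >= 1)    # "no element is >= 1" -> FALSE here, since 2 >= 1
#> [1] FALSE
```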

HTH,
Eric


On Thu, Dec 14, 2017 at 3:26 PM, DIGHE, NILESH [AG/2362] <
nilesh.di...@monsanto.com> wrote:

> Hi, I need some help with running a recursive function. I'd like to run
> funlp2 recursively.
> When I try to run recursive function in another function named "calclp" I
> get this "Error: any(!dat2$norm_sd) >= 1 is not TRUE".
>
> I have never built a recursive function before, so I am having trouble executing
> it in this case.  I would appreciate any help or guidance to resolve this
> issue. Please see my data and the three functions that I am using below.
> Please note that calclp is the function I am running and the other two
> functions are within this calclp function.
>
> # code:
> Test<- calclp(dataset = dat)
>
> # calclp function
>
> calclp<- function (dataset)
>
> {
>
> dat1 <- funlp1(dataset)
>
> recursive_funlp <- function(dataset = dat1, func = funlp2) {
>
> dat2 <- dataset %>% select(uniqueid, field_rep, lp) %>%
>
> mutate(field_rep = paste(field_rep, "lp", sep = ".")) %>%
>
> spread(key = field_rep, value = lp) %>% mutate_at(.vars =
> grep("_",
>
> names(.)), funs(norm = round(scale(.), 3)))
>
> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, sd, na.rm = TRUE), 3)
>
> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, function(x) {
>
> max(abs(x), na.rm = TRUE)
>
> }), 3)
>
> stopifnot(any(!dat2$norm_sd) >= 1)
>
> if (any(!dat2$norm_sd) >= 1) {
>
> df1 <- dat1
>
> return(df1)
>
> }
>
> else {
>
> df2 <- recursive_funlp()
>
> return(df2)
>
> }
>
> }
>
> df3 <- recursive_funlp(dataset = dat1, func = funlp2)
>
> df3
>
> }
>
>
> # funlp1 function
>
> funlp1<- function (dataset)
>
> {
>
> dat2 <- dataset %>% select(field, set, ent_num, rep_num,
>
> lp) %>% unite(uniqueid, set, ent_num, sep = ".") %>%
>
> unite(field_rep, field, rep_num) %>% mutate(field_rep =
> paste(field_rep,
>
> "lp", sep = ".")) %>% spread(key = field_rep, value = lp) %>%
>
> mutate_at(.vars = grep("_", names(.)), funs(norm = round(scale(.),
>
> 3)))
>
> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, sd, na.rm = TRUE), 3)
>
> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, function(x) {
>
> max(abs(x), na.rm = TRUE)
>
> }), 3)
>
> data1 <- dat2 %>% gather(key, value, -uniqueid, -norm_max,
>
> -norm_sd) %>% separate(key, c("field_rep", "treatment"),
>
> "\\.") %>% spread(treatment, value) %>% mutate(outlier = NA)
>
> df_clean <- with(data1, data1[norm_sd < 1, ])
>
> datD <- with(data1, data1[norm_sd >= 1, ])
>
> s <- split(datD, datD$uniqueid)
>
> sdf <- lapply(s, function(x) {
>
> data.frame(x, x$outlier <- ifelse(is.na(x$lp_norm), NA,
>
> ifelse(abs(x$lp_norm) == x$norm_max, "yes", "no")),
>
> x$lp <- with(x, ifelse(outlier == "yes", NA, lp)))
>
> x
>
> })
>
> sdf2 <- bind_rows(sdf)
>
> all_dat <- bind_rows(df_clean, sdf2)
>
> all_dat
>
> }
>
>
> # funlp2 function
>
> funlp2<-function (dataset)
>
> {
>
> data1 <- dataset
>
> df_clean <- with(data1, data1[norm_sd < 1, ])
>
> datD <- with(data1, data1[norm_sd >= 1, ])
>
> s <- split(datD, datD$uniqueid)
>
> sdf <- lapply(s, function(x) {
>
> data.frame(x, x$outlier <- ifelse(is.na(x$lp_norm), NA,
>
> ifelse(abs(x$lp_norm) == x$norm_max, "yes", "no")),
>
> x$lp <- with(x, ifelse(outlier == "yes", NA, lp)))
>
> x
>
> })
>
> sdf2 <- bind_rows(sdf)
>
> all_dat <- bind_rows(df_clean, sdf2)
>
> all_dat
>
> }
>
>
> # dataset
> dput(dat)
> structure(list(field = c("LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01",
> "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "LM01", "OL01",
> "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01",
> "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01",
> "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01",
> "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01", "OL01",
> "OL01", "OL

Re: [R] help with recursive function

2017-12-14 Thread Eric Berger
My own typo ... whoops ...

!( any(dat2$norm_sd >= 1 ))



On Thu, Dec 14, 2017 at 3:43 PM, Eric Berger  wrote:

> You seem to have a typo at this expression (and some others like it)
>
> Namely, you write
>
> any(!dat2$norm_sd) >= 1
>
> when you possibly meant to write
>
> !( any(dat2$norm_sd) >= 1 )
>
> i.e. I think your ! seems to be in the wrong place.
>
> HTH,
> Eric
>
>
> On Thu, Dec 14, 2017 at 3:26 PM, DIGHE, NILESH [AG/2362] <
> nilesh.di...@monsanto.com> wrote:
>
>> Hi, I need some help with running a recursive function. I like to run
>> funlp2 recursively.
>> When I try to run recursive function in another function named "calclp" I
>> get this "Error: any(!dat2$norm_sd) >= 1 is not TRUE".
>>
>> I have never built a recursive function before so having trouble
>> executing it in this case.  I would appreciate any help or guidance to
>> resolve this issue. Please see my data and the three functions that I am
>> using below.
>> Please note that calclp is the function I am running and the other two
>> functions are within this calclp function.
>>
>> # code:
>> Test<- calclp(dataset = dat)
>>
>> # calclp function
>>
>> calclp<- function (dataset)
>>
>> {
>>
>> dat1 <- funlp1(dataset)
>>
>> recursive_funlp <- function(dataset = dat1, func = funlp2) {
>>
>> dat2 <- dataset %>% select(uniqueid, field_rep, lp) %>%
>>
>> mutate(field_rep = paste(field_rep, "lp", sep = ".")) %>%
>>
>> spread(key = field_rep, value = lp) %>% mutate_at(.vars =
>> grep("_",
>>
>> names(.)), funs(norm = round(scale(.), 3)))
>>
>> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>>
>> 1, sd, na.rm = TRUE), 3)
>>
>> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>>
>> 1, function(x) {
>>
>> max(abs(x), na.rm = TRUE)
>>
>> }), 3)
>>
>> stopifnot(any(!dat2$norm_sd) >= 1)
>>
>> if (any(!dat2$norm_sd) >= 1) {
>>
>> df1 <- dat1
>>
>> return(df1)
>>
>> }
>>
>> else {
>>
>> df2 <- recursive_funlp()
>>
>> return(df2)
>>
>> }
>>
>> }
>>
>> df3 <- recursive_funlp(dataset = dat1, func = funlp2)
>>
>> df3
>>
>> }
>>
>>
>> # funlp1 function
>>
>> funlp1<- function (dataset)
>>
>> {
>>
>> dat2 <- dataset %>% select(field, set, ent_num, rep_num,
>>
>> lp) %>% unite(uniqueid, set, ent_num, sep = ".") %>%
>>
>> unite(field_rep, field, rep_num) %>% mutate(field_rep =
>> paste(field_rep,
>>
>> "lp", sep = ".")) %>% spread(key = field_rep, value = lp) %>%
>>
>> mutate_at(.vars = grep("_", names(.)), funs(norm = round(scale(.),
>>
>> 3)))
>>
>> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>>
>> 1, sd, na.rm = TRUE), 3)
>>
>> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>>
>> 1, function(x) {
>>
>> max(abs(x), na.rm = TRUE)
>>
>> }), 3)
>>
>> data1 <- dat2 %>% gather(key, value, -uniqueid, -norm_max,
>>
>> -norm_sd) %>% separate(key, c("field_rep", "treatment"),
>>
>> "\\.") %>% spread(treatment, value) %>% mutate(outlier = NA)
>>
>> df_clean <- with(data1, data1[norm_sd < 1, ])
>>
>> datD <- with(data1, data1[norm_sd >= 1, ])
>>
>> s <- split(datD, datD$uniqueid)
>>
>> sdf <- lapply(s, function(x) {
>>
>> data.frame(x, x$outlier <- ifelse(is.na(x$lp_norm), NA,
>>
>> ifelse(abs(x$lp_norm) == x$norm_max, "yes", "no")),
>>
>> x$lp <- with(x, ifelse(outlier == "yes", NA, lp)))
>>
>> x
>>
>> })
>>
>> sdf2 <- bind_rows(sdf)
>>
>> all_dat <- bind_rows(df_clean, sdf2)
>>
>> all_dat
>>
>> }
>>
>>

Re: [R] help with recursive function

2017-12-14 Thread Eric Berger
The message is coming from your stopifnot() call: stopifnot() throws exactly
that error when its condition evaluates to something other than TRUE.


On Thu, Dec 14, 2017 at 5:31 PM, DIGHE, NILESH [AG/2362] <
nilesh.di...@monsanto.com> wrote:

> Hi, I accidently left out few lines of code from the calclp function.
> Updated function is pasted below.
>
> I am still getting the same error “Error: !(any(data1$norm_sd >= 1)) is
> not TRUE“
>
>
>
> I would appreciate any help.
>
> Nilesh
>
> dput(calclp)
>
> function (dataset)
>
> {
>
> dat1 <- funlp1(dataset)
>
> recursive_funlp <- function(dataset = dat1, func = funlp2) {
>
> dat2 <- dataset %>% select(uniqueid, field_rep, lp) %>%
>
> mutate(field_rep = paste(field_rep, "lp", sep = ".")) %>%
>
> spread(key = field_rep, value = lp) %>% mutate_at(.vars =
> grep("_",
>
> names(.)), funs(norm = round(scale(.), 3)))
>
> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, sd, na.rm = TRUE), 3)
>
> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, function(x) {
>
> max(abs(x), na.rm = TRUE)
>
> }), 3)
>
> data1 <- dat2 %>% gather(key, value, -uniqueid, -norm_max,
>
> -norm_sd) %>% separate(key, c("field_rep", "treatment"),
>
> "\\.") %>% spread(treatment, value) %>% mutate(outlier = NA)
>
> stopifnot(!(any(data1$norm_sd >= 1)))
>
> if (!(any(data1$norm_sd >= 1))) {
>
> df1 <- dat1
>
> return(df1)
>
> }
>
>else {
>
> df2 <- recursive_funlp()
>
> return(df2)
>
> }
>
> }
>
> df3 <- recursive_funlp(dataset = dat1, func = funlp2)
>
> df3
>
> }
>
>
>
>
>
> *From:* DIGHE, NILESH [AG/2362]
> *Sent:* Thursday, December 14, 2017 9:01 AM
> *To:* 'Eric Berger' 
> *Cc:* r-help 
> *Subject:* RE: [R] help with recursive function
>
>
>
> Eric:  Thanks for taking time to look into my problem.  Despite making
> the change you suggested, I am still getting the same error.  I am
> wondering if the logic I am using in the stopifnot and if functions is a
> problem.
>
> I'd like the recursive function to stop whenever the norm_sd column has no
> values greater than or equal to 1. Below is the calclp function after the
> changes you suggested.
>
> Thanks. Nilesh
>
>
>
> dput(calclp)
>
> function (dataset)
>
> {
>
> dat1 <- funlp1(dataset)
>
> recursive_funlp <- function(dataset = dat1, func = funlp2) {
>
> dat2 <- dataset %>% select(uniqueid, field_rep, lp) %>%
>
> mutate(field_rep = paste(field_rep, "lp", sep = ".")) %>%
>
> spread(key = field_rep, value = lp) %>% mutate_at(.vars =
> grep("_",
>
> names(.)), funs(norm = round(scale(.), 3)))
>
> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, sd, na.rm = TRUE), 3)
>
> dat2$norm_max <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>
> 1, function(x) {
>
> max(abs(x), na.rm = TRUE)
>
> }), 3)
>
> stopifnot(!(any(dat2$norm_sd >= 1)))
>
>     if (!(any(dat2$norm_sd >= 1))) {
>
> df1 <- dat1
>
> return(df1)
>
> }
>
> else {
>
> df2 <- recursive_funlp()
>
> return(df2)
>
> }
>
> }
>
> df3 <- recursive_funlp(dataset = dat1, func = funlp2)
>
> df3
>
> }
>
>
>
>
>
> *From:* Eric Berger [mailto:ericjber...@gmail.com ]
>
> *Sent:* Thursday, December 14, 2017 8:17 AM
> *To:* DIGHE, NILESH [AG/2362] 
> *Cc:* r-help 
> *Subject:* Re: [R] help with recursive function
>
>
>
> My own typo ... whoops ...
>
>
>
> !( any(dat2$norm_sd >= 1 ))
>
>
>
>
>
>
>
> On Thu, Dec 14, 2017 at 3:43 PM, Eric Berger 
> wrote:
>
> You seem to have a typo at this expression (and some others like it)
>
>
>
> Namely, you write
>
>
>
> any(!dat2$norm_sd) >= 1
>
>
>
> when you possibly meant to write
>
>
>
> !( any(dat2$norm_sd) >= 1 )
>
>
>
> i.e. I think your ! seems to be in the wrong place.
>
>
>
> HTH,
> Eric
>
>
>
>
>
> On Thu

Re: [R] help with recursive function

2017-12-14 Thread Eric Berger
If you are trying to understand why the "stopifnot" condition is met you
can replace it by something like:

if ( any(dat2$norm_sd >= 1) )
   browser()

This will put you in a debugging session where you can examine your
variables, e.g.

> dat2$norm_sd

HTH,
Eric



On Thu, Dec 14, 2017 at 5:33 PM, Eric Berger  wrote:

> The message is coming from your stopifnot() condition being met.
>
>
> On Thu, Dec 14, 2017 at 5:31 PM, DIGHE, NILESH [AG/2362] <
> nilesh.di...@monsanto.com> wrote:
>
>> Hi, I accidently left out few lines of code from the calclp function.
>> Updated function is pasted below.
>>
>> I am still getting the same error “Error: !(any(data1$norm_sd >= 1)) is
>> not TRUE“
>>
>>
>>
>> I would appreciate any help.
>>
>> Nilesh
>>
>> dput(calclp)
>>
>> function (dataset)
>>
>> {
>>
>> dat1 <- funlp1(dataset)
>>
>> recursive_funlp <- function(dataset = dat1, func = funlp2) {
>>
>> dat2 <- dataset %>% select(uniqueid, field_rep, lp) %>%
>>
>> mutate(field_rep = paste(field_rep, "lp", sep = ".")) %>%
>>
>> spread(key = field_rep, value = lp) %>% mutate_at(.vars =
>> grep("_",
>>
>> names(.)), funs(norm = round(scale(.), 3)))
>>
>> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>>
>> 1, sd, na.rm = TRUE), 3)
>>
>> dat2$norm_max <- round(apply(dat2[, grep("lp_norm",
>> names(dat2))],
>>
>> 1, function(x) {
>>
>> max(abs(x), na.rm = TRUE)
>>
>> }), 3)
>>
>> data1 <- dat2 %>% gather(key, value, -uniqueid, -norm_max,
>>
>> -norm_sd) %>% separate(key, c("field_rep", "treatment"),
>>
>> "\\.") %>% spread(treatment, value) %>% mutate(outlier = NA)
>>
>> stopifnot(!(any(data1$norm_sd >= 1)))
>>
>> if (!(any(data1$norm_sd >= 1))) {
>>
>> df1 <- dat1
>>
>> return(df1)
>>
>> }
>>
>>else {
>>
>> df2 <- recursive_funlp()
>>
>> return(df2)
>>
>> }
>>
>> }
>>
>> df3 <- recursive_funlp(dataset = dat1, func = funlp2)
>>
>> df3
>>
>> }
>>
>>
>>
>>
>>
>> *From:* DIGHE, NILESH [AG/2362]
>> *Sent:* Thursday, December 14, 2017 9:01 AM
>> *To:* 'Eric Berger' 
>> *Cc:* r-help 
>> *Subject:* RE: [R] help with recursive function
>>
>>
>>
>> Eric:  Thanks for taking time to look into my problem.  Despite making
>> the change you suggested, I am still getting the same error.  I am
>> wondering if the logic I am using in the stopifnot and if functions is a
>> problem.
>>
>> I'd like the recursive function to stop whenever the norm_sd column has
>> no values greater than or equal to 1. Below is the calclp function
>> after the changes you suggested.
>>
>> Thanks. Nilesh
>>
>>
>>
>> dput(calclp)
>>
>> function (dataset)
>>
>> {
>>
>> dat1 <- funlp1(dataset)
>>
>> recursive_funlp <- function(dataset = dat1, func = funlp2) {
>>
>> dat2 <- dataset %>% select(uniqueid, field_rep, lp) %>%
>>
>> mutate(field_rep = paste(field_rep, "lp", sep = ".")) %>%
>>
>> spread(key = field_rep, value = lp) %>% mutate_at(.vars =
>> grep("_",
>>
>> names(.)), funs(norm = round(scale(.), 3)))
>>
>> dat2$norm_sd <- round(apply(dat2[, grep("lp_norm", names(dat2))],
>>
>> 1, sd, na.rm = TRUE), 3)
>>
>> dat2$norm_max <- round(apply(dat2[, grep("lp_norm",
>> names(dat2))],
>>
>> 1, function(x) {
>>
>> max(abs(x), na.rm = TRUE)
>>
>> }), 3)
>>
>> stopifnot(!(any(dat2$norm_sd >= 1)))
>>
>> if (!(any(dat2$norm_sd >= 1))) {
>>
>> df1 <- dat1
>>
>> return(df1)
>>
>> }
>>
>> else {
>>
>> df2 <- recursive_funlp()
>>
>> return(df2)
>>
>>

Re: [R] Finding center of mass in a hydrologic time series

2017-12-16 Thread Eric Berger
Hi Eric,
How about

match( TRUE, cumsum(hyd/sum(hyd)) > .5 ) - 1
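
A toy walk-through of the one-liner (made-up numbers): the cumulative sum of
the normalized series crosses 0.5 at the half-area point, and match() returns
the first index past it.

```r
hyd <- c(10, 20, 30, 25, 15)                     # toy series, total 100
cumsum(hyd / sum(hyd))                           # 0.10 0.30 0.60 0.85 1.00
match(TRUE, cumsum(hyd / sum(hyd)) > 0.5) - 1    # first crossing at index 3 -> answer 2
#> [1] 2
```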

HTH,
Eric


On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric  wrote:

> The small bit of script below is an example of what I'm attempting to do -
> find the day on which the 'center of mass' occurs.  In case that is the
> wrong term, I'd like to know the day that essentially cuts the area under
> the curve in to two equal parts:
>
> set.seed(4004)
> Date <- seq(as.Date('2000-09-01'), as.Date('2000-09-30'), by='day')
> hyd <- ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
> seq(45,1,length.out=30)) + rnorm(30)*8) - 800
>
> # View the example curve
> plot(Date, hyd, las=1)
>
> # By trial-and-error, the day on which the center of mass occurs is the
> 11th day:
> # Add up the area under the curve for the first 11 days and compare
> # with the last 19 days:
>
> sum(hyd[1:11])
> # 3546.364
> sum(hyd[12:30])
> # 3947.553
>
> # Add up the area under the curve for the first 12 days and compare
> # with the last 18 days:
>
> sum(hyd[1:12])
> # 3875.753
> sum(hyd[13:30])
> # 3618.164
>
> By day 12, the halfway point has already been passed, so the answer that
> would be returned would be:
>
> Date[11]
> # "2000-09-11"
>
> For the larger problem, it'd be handy if the proposed function could
> process a multi-year time series (a runoff hydrograph) and return the day
> of the center of mass for each year in the time series.
>
> I appreciate any pointers...Eric
>


Re: [R] Auto Data in the ISLR Package

2017-12-17 Thread Eric Berger
myAuto <- Auto[ grep("ford|toyota",Auto$name),]



On Sat, Dec 16, 2017 at 10:28 PM, Bert Gunter 
wrote:

> I did not care to load the packages -- small reproducible examples are
> preferable, as the posting guide suggests.
>
> But, if I have understood correctly:
>
> See, e.g. ?subset
>
> Alternatively, you can read up on indexing data frames in any good basic R
> tutorial.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Sat, Dec 16, 2017 at 11:44 AM, AbouEl-Makarim Aboueissa <
> abouelmakarim1...@gmail.com> wrote:
>
> > Dear All:
> >
> > I would like to create a subset data set *with only* all Ford and all
> > Toyota cars from the Auto data set  in ISLR R Package.  Thank you very
> much
> > in advance.
> >
> > Please use the following code to see how is the data look like.
> >
> >
> > install.packages("ISLR")
> > library(ISLR)
> > data(Auto)
> > head(Auto)
> >
> >
> > with many thanks
> > abou
> > __
> >
> >
> > *AbouEl-Makarim Aboueissa, PhD*
> >
> > *Professor of Statistics*
> >
> > *Department of Mathematics and Statistics*
> > *University of Southern Maine*
> >


Re: [R] Auto Data in the ISLR Package

2017-12-17 Thread Eric Berger
Hi Peter,
I looked at the Auto data frame and tested before I sent my reply. The
entries in the "name" column are actually models, such as

> head(Auto$name)
[1] chevrolet chevelle malibu buick skylark 320 plymouth satellite
  amc rebel sst
[5] ford torino   ford galaxie 500

What you are suggesting won't work. I agree with your "bedford" example as
a problem, but given the size of the result set in this case (~73 rows)
it's easy to eyeball the results and see if they're ok.
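
If one wants Peter's protection without losing the matches, a possible middle
ground (a sketch with toy names standing in for Auto$name) is to compare only
the first word of each name, which is the make in this data set:

```r
# Toy stand-in for Auto$name; "bedford van" is the false-positive case.
name <- c("ford torino", "bedford van", "toyota corolla")
make <- sub(" .*", "", name)            # first word of each entry
name[make %in% c("ford", "toyota")]     # keeps only the ford and toyota rows
```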

Regards,
Eric


On Sun, Dec 17, 2017 at 11:00 AM, peter dalgaard  wrote:

> That probably works in this case, but it would cause grief if another car
> make had "ford" somewhere inside its name e.g. "bedford". Safer general
> practice is
>
> Auto[Auto$name %in% c("ford", "toyota"),]
>
> or similar using subset().
>
> -pd
>
> > On 17 Dec 2017, at 09:10 , Eric Berger  wrote:
> >
> > myAuto <- Auto[ grep("ford|toyota",Auto$name),]
> >
> >
> >
> > On Sat, Dec 16, 2017 at 10:28 PM, Bert Gunter 
> > wrote:
> >
> >> I did not care to load the packages -- small reproducible examples are
> >> preferable, as the posting guide suggests.
> >>
> >> But, if I have understood correctly:
> >>
> >> See, e.g. ?subset
> >>
> >> Alternatively, you can read up on indexing data frames in any good
> basic R
> >> tutorial.
> >>
> >> Cheers,
> >> Bert
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> and
> >> sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Sat, Dec 16, 2017 at 11:44 AM, AbouEl-Makarim Aboueissa <
> >> abouelmakarim1...@gmail.com> wrote:
> >>
> >>> Dear All:
> >>>
> >>> I would like to create a subset data set *with only* all Ford and all
> >>> Toyota cars from the Auto data set  in ISLR R Package.  Thank you very
> >> much
> >>> in advance.
> >>>
> >>> Please use the following code to see how is the data look like.
> >>>
> >>>
> >>> install.packages("ISLR")
> >>> library(ISLR)
> >>> data(Auto)
> >>> head(Auto)
> >>>
> >>>
> >>> with many thanks
> >>> abou
> >>> __
> >>>
> >>>
> >>> *AbouEl-Makarim Aboueissa, PhD*
> >>>
> >>> *Professor of Statistics*
> >>>
> >>> *Department of Mathematics and Statistics*
> >>> *University of Southern Maine*
> >>>
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>
>
>
>
>
>
>
>
>



Re: [R] Auto Data in the ISLR Package

2017-12-17 Thread Eric Berger
myAuto <- Auto[ grep("ford|toyota",Auto$name),]
myAuto$Make <- NA
myAuto$Make[grep("ford",myAuto$name)] <- "Ford"
myAuto$Make[grep("toyota",myAuto$name)] <- "Toyota"

Regards,
Eric


On Sun, Dec 17, 2017 at 11:58 AM, AbouEl-Makarim Aboueissa <
abouelmakarim1...@gmail.com> wrote:

> Dear Eric:
>
> Thank you very much. It works nicely.
>
> *Just one more thing;* how to create a new variable (say, *Make*) with *Make
> = Ford* for the ford brand and *Make = T**oyota* for the toyota brand.
>
> Once again thank you all.
>
> abou
>
> __
>
>
> *AbouEl-Makarim Aboueissa, PhD*
>
> *Professor of Statistics*
>
> *Department of Mathematics and Statistics*
> *University of Southern Maine*
>
>
> On Sun, Dec 17, 2017 at 3:10 AM, Eric Berger 
> wrote:
>
>> myAuto <- Auto[ grep("ford|toyota",Auto$name),]
>>
>>
>>
>> On Sat, Dec 16, 2017 at 10:28 PM, Bert Gunter 
>> wrote:
>>
>>> I did not care to load the packages -- small reproducible examples are
>>> preferable, as the posting guide suggests.
>>>
>>> But, if I have understood correctly:
>>>
>>> See, e.g. ?subset
>>>
>>> Alternatively, you can read up on indexing data frames in any good basic
>>> R
>>> tutorial.
>>>
>>> Cheers,
>>> Bert
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and
>>> sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>> On Sat, Dec 16, 2017 at 11:44 AM, AbouEl-Makarim Aboueissa <
>>> abouelmakarim1...@gmail.com> wrote:
>>>
>>> > Dear All:
>>> >
>>> > I would like to create a subset data set *with only* all Ford and all
>>> > Toyota cars from the Auto data set  in ISLR R Package.  Thank you very
>>> much
>>> > in advance.
>>> >
>>> > Please use the following code to see how is the data look like.
>>> >
>>> >
>>> > install.packages("ISLR")
>>> > library(ISLR)
>>> > data(Auto)
>>> > head(Auto)
>>> >
>>> >
>>> > with many thanks
>>> > abou
>>> > __
>>> >
>>> >
>>> > *AbouEl-Makarim Aboueissa, PhD*
>>> >
>>> > *Professor of Statistics*
>>> >
>>> > *Department of Mathematics and Statistics*
>>> > *University of Southern Maine*
>>> >
>>
>>
>



Re: [R] Finding center of mass in a hydrologic time series

2017-12-18 Thread Eric Berger
Hi Eric,
the following works for me.

HTH,
Eric

library(EGRET)

StartDate <- "1990-10-01"
EndDate <- "2017-09-30"
siteNumber <- "1031"
QParameterCd <- "00060"

Daily <- readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate)

# Define 'center of mass' function
com <- function(x) {
  match(TRUE, cumsum(x/sum(x)) > 0.5) - 1
}
wyrs <- unique(Daily$waterYear)
x <- as.Date(sapply(wyrs, function(yr) {
       Df <- Daily[Daily$waterYear == yr, ]
       Df$Date[com(Df$Q)]
     }), "1970-01-01")
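
One note on the as.Date(..., "1970-01-01") wrapper: sapply() simplifies its
result to a plain numeric vector and silently drops the Date class, so the
days-since-epoch numbers must be converted back with an origin. A minimal
illustration (toy dates):

```r
d <- sapply(1:3, function(i) as.Date("2000-01-01") + i)
class(d)                           # "numeric" -- the Date class was dropped
as.Date(d, origin = "1970-01-01")  # restore the dates
```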



On Mon, Dec 18, 2017 at 4:47 PM, Morway, Eric  wrote:

> Eric B's response provided just the kind of quick & simple solution I was
> hoping for (appears as the function com below).  However, I once again
> failed to take advantage of the power of R and have reverted back to using
> a for loop for the next step of the processing.  The example below (which
> requires the library EGRET for pulling an example dataset) works, but
> probably can be replaced with some version of the apply functionality?  So
> far, I've been unable to figure out how to enter the arguments to the apply
> function.  The idea is this: for each unique water year (variable 'wyrs' in
> example below) in a 27 year continuous time series of daily values, find
> the date of the 'center of mass', and build a vector of those dates.
> Thanks, -Eric M
>
> library(EGRET)
>
> StartDate <- "1990-10-01"
> EndDate <- "2017-09-30"
> siteNumber <- "1031"
> QParameterCd <- "00060"
>
> Daily <- readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate)
>
> # Define 'center of mass' function
> com <- function(x) {
> match(TRUE, cumsum(x/sum(x)) > 0.5) - 1
> }
>
>
> wyrs <- unique(Daily$waterYear)
> for(i in (1:length(wyrs))){
> OneYr <- Daily[Daily$waterYear==wyrs[i], ]
> mid <- com(OneYr$Q)
> if(i==1){
> midPts <- as.Date(OneYr$Date[mid])
> } else {
> midPts <- c(midPts, as.Date(OneYr$Date[mid]))
> }
> }
>
>
>
> Eric Morway
> Research Hydrologist
> Nevada Water Science Center
> U.S. Geological Survey
> 2730 N. Deer Run Rd.
> Carson City, NV 89701
> (775) 887-7668
> *orcid*: -0002-8553-6140
>
>
>
> On Sat, Dec 16, 2017 at 5:32 AM, Eric Berger 
> wrote:
>
>> Hi Eric,
>> How about
>>
>> match( TRUE, cumsum(hyd/sum(hyd)) > .5 ) - 1
>>
>> HTH,
>> Eric
>>
>>
>> On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric  wrote:
>>
>>> The small bit of script below is an example of what I'm attempting to do
>>> -
>>> find the day on which the 'center of mass' occurs.  In case that is the
>>> wrong term, I'd like to know the day that essentially cuts the area under
>>> the curve in to two equal parts:
>>>
>>> set.seed(4004)
>>> Date <- seq(as.Date('2000-09-01'), as.Date('2000-09-30'), by='day')
>>> hyd <- ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
>>> seq(45,1,length.out=30)) + rnorm(30)*8) - 800
>>>
>>> # View the example curve
>>> plot(Date, hyd, las=1)
>>>
>>> # By trial-and-error, the day on which the center of mass occurs is the
>>> 11th day:
>>> # Add up the area under the curve for the first 11 days and compare
>>> # with the last 19 days:
>>>
>>> sum(hyd[1:11])
>>> # 3546.364
>>> sum(hyd[12:30])
>>> # 3947.553
>>>
>>> # Add up the area under the curve for the first 12 days and compare
>>> # with the last 18 days:
>>>
>>> sum(hyd[1:12])
>>> # 3875.753
>>> sum(hyd[13:30])
>>> # 3618.164
>>>
>>> By day 12, the halfway point has already been passed, so the answer that
>>> would be returned would be:
>>>
>>> Date[11]
>>> # "2000-09-11"
>>>
>>> For the larger problem, it'd be handy if the proposed function could
>>> process a multi-year time series (a runoff hydrograph) and return the day
>>> of the center of mass for each year in the time series.
>>>
>>> I appreciate any pointers...Eric


Re: [R] Help with script

2017-12-28 Thread Eric Berger
Hi Pablo,
There are probably many ways to do this in R. This suggestion uses dplyr.
The solution is actually only one line (see the line starting with dat2).
The first section simply creates the example data.

library(dplyr)
# 1. set up the example data
m <- matrix( c(0,0,0,0,0,1,1,1,0,0,1,1,1,1,2,1,1,2,0,1,2,2,2,1,0,1,1,1),
nrow=4)
dat <- as.data.frame(m)
dat$ID <- c("a1","a2","a2","a3")
dat <- dat[,c(8,1:7)]
colnames(dat) <- c("ID",LETTERS[1:7])

#2. group the data by ID, summing the columns in each group
dat2 <- group_by(dat,ID) %>% summarise_all( sum )

#3. show the results
dat2

# # A tibble: 3 x 8
#   ID        A     B     C     D     E     F     G
# 1 a1        0     0     0     1     1     2     0
# 2 a2        0     2     1     3     2     4     2
# 3 a3        0     1     1     1     1     1     1
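As a point of comparison (not part of the original answer), base R's aggregate() produces the same grouped sums without dplyr, using the 'dat' built above:

```r
# Base-R equivalent of the group_by/summarise_all step above
dat2_base <- aggregate(. ~ ID, data = dat, FUN = sum)
dat2_base
#   ID A B C D E F G
# 1 a1 0 0 0 1 1 2 0
# 2 a2 0 2 1 3 2 4 2
# 3 a3 0 1 1 1 1 1 1
```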

HTH,
Eric


On Fri, Dec 29, 2017 at 2:03 AM, PABLO ORTIZ PINEDA 
wrote:

> Hello there. Happy new year for everyone!
>
> I need help with a table. This table contains 300 rows and 192 columns.
> Being the first column the ID of my samples that can have several
> observations.
>
> I need to generate a NEW table that contains a single ID with the sum of
> the observations by columns:
> For example:
>
> Example
> ID   A B C D E F G ... 191 columns
> a1   0 0 0 1 1 2 0 ...
> a2   0 1 0 1 2 2 1 ...
> a2   0 1 1 2 0 2 1 ...
> a3   0 1 1 1 1 1 1
> ... 300 rows
> In this case I want to make a new table in which there is only 1 ID and
> the values of each column A...G are added.
> In the example the new table would have only 3 IDs: a1, a2 and a3, and a2
> has the values added by column:
> a2   0   2   1   3   2   4   2 ...
>
> Thank you so much and have a wonderful year!.
>
> --
> Pablo A. Ortiz-Pineda (Ph.D.)
> Molecular Biology & Bioinformatics
> Yale University. School of Medicine.
> Pediatrics Department.
> New Haven, CT 06510
>
>


Re: [R] Writing text files out of a dataset

2017-12-29 Thread Eric Berger
You have an error with the filename in the loop.
Try replacing the relevant line with
fileConn<-file(sprintf("TESTI/%d.txt",i))
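Spelled out in the context of the loop from the question, the fix looks like this (a sketch: it assumes the TESTI directory exists and that the "material" column is character). The original "TESTI/d[i].txt" was a literal string, so every iteration wrote to the same file.

```r
# Corrected loop: sprintf() builds a distinct file name per row
for (i in 1:nrow(data)) {
  fileConn <- file(sprintf("TESTI/%d.txt", i))
  writeLines(data[i, "material"], fileConn)
  close(fileConn)
}
```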

HTH,
Eric


On Fri, Dec 29, 2017 at 4:31 PM, Luca Meyer  wrote:

> Hello,
>
> I am trying to run the following syntax for all cases within the dataframe
> "data"
>
> d1 <- data[1,c("material")]
> fileConn<-file("TESTI/d1.txt")
> writeLines(d1, fileConn)
> close(fileConn)
>
> I am trying to use the for function:
>
>  for (i in 1:nrow(data)){
>   d[i] <- data[i,c("material")]
>   fileConn<-file("TESTI/d[i].txt")
>   writeLines(d[i], fileConn)
>   close(fileConn)
> }
>
> but I get the error:
>
> Object "d" not found
>
> Any suggestion on how I can solve the above?
>
> Thanks,
>
> Luca
>


Re: [R] Discrete valued time series data sets.

2018-01-02 Thread Eric Berger
Hi Rolf,
I looked at
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-public-data-sets

One of the first sets in the list is the airline time series (I think it is
also used in dplyr examples).

https://www.transtats.bts.gov/OT_Delay/OT_DelayCause1.asp

You might find other possibilities in that list.

HTH,
Eric


On Tue, Jan 2, 2018 at 12:44 AM, Rolf Turner 
wrote:

>
> I am looking for (publicly available) examples of discrete valued time
> series data sets.  I have googled around a bit and have found lots of
> articles and books on discrete valued time series, but have had no success
> in locating sites at which data are available.
>
> Can anyone make any useful suggestions?
>
> Thanks.
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>


Re: [R] Replace NAs in split lists

2018-01-08 Thread Eric Berger
You can enforce these assumptions by sorting on multiple columns, which
leads to

na.locf(df1[ order(df1$ID,df1$Value), ])
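A minimal reproducible sketch of that one-liner, using the df1 from earlier in the thread (assumes the zoo package is installed; order() places NAs last within each ID, so the non-NA value leads its group and na.locf() carries it forward):

```r
library(zoo)

df1 <- data.frame(ID     = c("a", "a", "a", "b", "b"),
                  ID_2   = c("aa", "ab", "ac", "aa", "ab"),
                  Firist = c(TRUE, FALSE, FALSE, TRUE, FALSE),
                  Value  = c(2, NA, NA, 5, NA))

# NAs sort last within each ID, so Value is filled to 2 2 2 5 5
na.locf(df1[order(df1$ID, df1$Value), ])
```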



On Mon, Jan 8, 2018 at 4:19 PM, Jeff Newmiller 
wrote:

> Yes, you are right if the IDs are always sequentially-adjacent and the
> first non-NA value appears in the first record for each ID.
> --
> Sent from my phone. Please excuse my brevity.
>
> On January 8, 2018 2:29:40 AM PST, PIKAL Petr 
> wrote:
> >Hi
> >
> >With the example, na.locf seems to be the easiest way.
> >> library(zoo)
> >
> >> na.locf(df1)
> >  ID ID_2 Firist Value
> >1  a   aa   TRUE 2
> >2  a   ab  FALSE 2
> >3  a   ac  FALSE 2
> >4  b   aa   TRUE 5
> >5  b   ab  FALSE 5
> >
> >Cheers
> >Petr
> >
> >> -Original Message-
> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jeff
> >> Newmiller
> >> Sent: Monday, January 8, 2018 9:13 AM
> >> To: r-help@r-project.org; Ek Esawi 
> >> Subject: Re: [R] Replace NAs in split lists
> >>
> >> Upon closer examination I see that you are not using the split
> >version of
> >> df1 as I usually would, so here is a reproducible example:
> >>
> >> #
> >> df1 <- read.table( text=
> >> "ID ID_2 Firist Value
> >> 1  a   aa   TRUE     2
> >> 2  a   ab  FALSE    NA
> >> 3  a   ac  FALSE    NA
> >> 4  b   aa   TRUE     5
> >> 5  b   ab  FALSE    NA
> >> ", header=TRUE, as.is=TRUE )
> >>
> >> sdf <- split( df1, df1$ID )
> >> # note the extra [ 1 ] in case you have more than one non-NA value
> >> # per ID
> >> sdf2 <- lapply( sdf
> >>               , function( z ) {
> >>                   z$Value <- ifelse( is.na( z$Value )
> >>                                    , z$Value[ !is.na( z$Value ) ][ 1 ]
> >>                                    , z$Value
> >>                                    )
> >>                   z
> >>                 }
> >>               )
> >> df2 <- do.call( rbind, sdf2 )
> >> df2
> >> #> ID ID_2 Firist Value
> >> #> a.1  a   aa   TRUE 2
> >> #> a.2  a   ab  FALSE 2
> >> #> a.3  a   ac  FALSE 2
> >> #> b.4  b   aa   TRUE 5
> >> #> b.5  b   ab  FALSE 5
> >>
> >> # or using tidyverse methods
> >>
> >> library(dplyr)
> >> #>
> >> #> Attaching package: 'dplyr'
> >> #> The following objects are masked from 'package:stats':
> >> #>
> >> #> filter, lag
> >> #> The following objects are masked from 'package:base':
> >> #>
> >> #> intersect, setdiff, setequal, union
> >> df3 <- (   df1
> >> %>% group_by( ID )
> >> %>% do({
> >>mutate( .
> >>  , Value = ifelse( is.na( Value )
> >>  , Value[ !is.na( Value ) ][ 1 ]
> >>  , Value
> >>  )
> >>  )
> >> })
> >> %>% ungroup
> >> )
> >> df3
> >> #> # A tibble: 5 x 4
> >> #>   ID    ID_2  Firist Value
> >> #> 1 a     aa    TRUE       2
> >> #> 2 a     ab    FALSE      2
> >> #> 3 a     ac    FALSE      2
> >> #> 4 b     aa    TRUE       5
> >> #> 5 b     ab    FALSE      5
> >> #
> >>
> >> On Sun, 7 Jan 2018, Jeff Newmiller wrote:
> >>
> >> > Why do you want to modify df1?
> >> >
> >> > Why not just reassemble the parts as a new data frame and use that
> >> > going forward in your calculations? That is generally the preferred
> >> > approach in R so you can re-do your calculations easily if you find
> >a
> >> > mistake later.
> >> > --
> >> > Sent from my phone. Please excuse my brevity.
> >> >
> >> > On January 7, 2018 7:35:59 PM PST, Ek Esawi 
> >wrote:
> >> >> I just came up with a solution right after i posted the question,
> >but
> >> >> i figured there must be a better and shorter one.than my solution
> >> >> sdf1[[1]][1,4]<-lapplyresults[[1]]
> >> >> sdf1[[2]][1,4]<-lapplyresults[[2]]
> >> >>
> >> >> EK
> >> >>
> >> >> On Sun, Jan 7, 2018 at 10:13 PM, Ek Esawi 
> >wrote:
> >> >>> Hi all--
> >> >>>
> >> >>> I stumbled on this problem online. I did not like the solution
> >given
> >> >>> there which was a long UDF. I thought why cannot split and l/s
> >apply
> >> >>> work here. My aim is to split the data frame, use l/sapply, make
> >> >>> changes on the split lists and combine the split lists to new
> >data
> >> >>> frame with the desired changes/output.
> >> >>>
> >> >>> The data frame shown below has a column named ID which has 2
> >> >>> variables, a and b; I want to replace the NAs in the Value column
> >> >>> by 2, which is the only numeric entry, for ID=a, and by 5 for ID=b.
> >> >>>
> >> >>> I worked out the solution but could not replace the results in
> >the
> >> >> split lists.
> >> >>>
> >> >>> Original dataframe , df1
> >> >>>   ID ID_2 Firist Value
> >> >>> 1  a   aa   TRUE 2
> >> >>> 2  a   ab  FALSENA
> >> >>> 3  a   ac  FALSENA
> >> >>> 4  b   aa   TRUE 5
> >> >>> 5  b   ab  FALSENA
> >> >>> Sdf1
> >> >>> $a
> >> >>> ID ID_2 Firist Value
> >> >>> 1  a   aa   TRUE 2
> >> >>> 2  a

Re: [R] application of R

2018-01-11 Thread Eric Berger
Marc and Jeff give excellent advice. Since you have a commercial
perspective, here are two more points to consider:
1. There are companies that sell software built on R. For example, the
company Rstudio.com develops both free and "professional" versions of its
products RStudio and Shiny.
2. You ask about selling software. Switch hats and think about buying
software. Some real-world problems can be solved using commercial products
such as Matlab (which costs thousands of dollars.) For some of these
problems, the world of R (and more generally CRAN - the Comprehensive R
Archive Network - https://cran.r-project.org/ - where you can find many of
the freely available R-packages) is a great alternative and it is free.

On Fri, Jan 12, 2018 at 6:40 AM, Jeff Newmiller 
wrote:

> Because many technical people need to accomplish statistical data analysis
> with computers that depend on existing algorithms applied in new ways, or
> with new algorithms that are not implemented by commercial software.  Often
> such people have no desire to provide step-by-step support of their tools
> for every user of their code indefinitely, so developing commercial
> software for others is less useful to them than having access to existing
> software that can be adapted. They often find that allowing others access
> to their code is a reasonable trade for being able to re-use the work of
> others before them.
>
> You might read the book "The Cathedral and the Bazaar" for more detail
> about this perspective, but this line of discussion is not really on topic
> here.
> --
> Sent from my phone. Please excuse my brevity.
>
> On January 11, 2018 7:09:20 PM PST, muhammad ramzi 
> wrote:
> >Thank you very much this really helped me a lot .
> >So actually why would people learn R(other than personal interests ) if
> >you can't really build anything that can be sold ? I'm sorry if I'm
> >asking bad questions
> >
> >
> >> On 12 Jan 2018, at 4:43 AM, Marc Schwartz 
> >wrote:
> >>
> >>
> >>
> >>> On Jan 11, 2018, at 2:15 PM, muhammad ramzi 
> >wrote:
> >>>
> >>> hello guys,
> >>>
> >>> i am a petroleum engineering student and i will be having a long
> >semester
> >>> break and currently i am learning THE R PROGRAMMING LANGUAGE just
> >out of
> >>> interest. I would just like to know if i am able to design a
> >business
> >>> analysis software using R as in create a type of software that can
> >be sold
> >>> to business people. can this be done in R language?
> >>>
> >>> another thing is if i do learn this all the way, what advantages
> >will it
> >>> give me in terms of future prospects and career development?
> >>
> >>
> >> Hi,
> >>
> >> To your first question, as R is open source and released under the
> >GPL, there are legal issues that you will need to consider, which will
> >be specific to the details of your plans, how your "application" is
> >built, how it interacts with R, and importantly, the copying and
> >distribution of the end product.
> >>
> >> You should, first and foremost, contact a lawyer familiar with open
> >source software, specifically GPL compatible licenses, so that you can
> >get proper legal advice, which you will not get here. You risk
> >legal/financial liabilities down the road if not done in compliance
> >with the license requirements.
> >>
> >> As a first pass, you should read:
> >>
> >>
> >https://cran.r-project.org/doc/FAQ/R-FAQ.html#Can-I-use-R-for-commercial-purposes_003f
> >>
> >> and
> >>
> >>  https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html
> >>
> >> so that you can gain initial insights into some of the general
> >implications of building a product for distribution (whether you give
> >it away or sell it) that depends upon a GPL licensed application.
> >>
> >> Whether or not there is utility for the application you envision such
> >that people would be willing to pay for it, will depend upon a variety
> >of factors, not the least of which is what competition you face and the
> >value of your planned application over others that are already in the
> >marketplace.
> >>
> >> To your second question, you are asking a biased, self selected
> >audience. Thus, take that into account for any responses that you may
> >get.
> >>
> >> The responses relative to advantages are going to be, to some extent,
> >broadly industry specific. That being said, in many domains, knowing R,
> >along with other relevant applications and programming languages can
> >only be beneficial in many cases.
> >>
> >> R is becoming increasingly popular (e.g. see:
> >https://www.tiobe.com/tiobe-index/). However, depending upon the
> >subject matter domain you will work in and to a large extent, the
> >company or institution you will work for, those factors can have a
> >material influence on the role that R might play in that environment.
> >>
> >> Others can perhaps chime in with other thoughts and perhaps even
> >industry specific insights for you.
> >>
> >> Regards,
> >>
> >> Marc Schwartz
> >>
> >

Re: [R] barplot that displays sums of values of 2 y colums grouped by different variables

2018-01-15 Thread Eric Berger
'position="dodge"' has no effect in the plot because the x-axis is a factor
variable. The bars do not need to be moved to avoid each other. The
'aes(fill=y)' is specifying that you want the color gradient to capture the
sums in the 'y' variable. You might be better off to use 'no' and 'yes'
rather than 'n' and 'y' to avoid confusion. Then you would see that the
statement would be 'aes(fill=yes)'. Summary: the height of each bar
represents the sum of the 'no' for that city, and the color of each bar
represents the sum of the 'yes' for that city. Your code is fine, unless
that is not what you were trying to do.

HTH,
Eric


On Mon, Jan 15, 2018 at 6:59 PM, kenneth dyson  wrote:

> I am trying to create a barplot displaying the sums of 2 columns of data
> grouped by a variable. the data is set up like this:
>
> "city" "n" "y" 
> mon 100 200 
> tor 209 300 
> edm 98 87 
> mon 20 76 
> tor 50 96 
> edm 62 27 
>
> the resulting plot should have city as the x-axis, 2 bars per city, 1
> representing the sum of "n" in that city, the other the sum of "y" in that
> city.
>
> If possible also show the sum in each bar as a label?
>
> I aggregated the data into sums like this:
>
> sum_data <- aggregate(. ~ City,data=raw_data,sum)
>
> this gave me the sums per city as I wanted but for some reason 1 of the
> cities is missing in the output.
>
> Using this code for the plot:
>
> ggplot(sum_data,aes(x = City,y = n)) + geom_bar(aes(fill = y),stat =
> "identity",position = "dodge")
>
> gave me a bar plot with one bar per city showing the sum of y as a color
> gradient. not what I expected given the "dodge" command in geom_bar.
>
> Thanks.
>


Re: [R] barplot that displays sums of values of 2 y colums grouped by different variables

2018-01-15 Thread Eric Berger
https://stackoverflow.com/questions/25070547/ggplot-side-by-side-geom-bar
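In short, the linked answer reshapes the summed data to long format so that "n" and "y" become levels of a single variable, which can then be dodged. A sketch (it assumes the sum_data from the earlier aggregate() call, with columns City, n and y):

```r
library(ggplot2)
library(reshape2)   # tidyr::pivot_longer() would work equally well

# long format: one row per (City, n/y) pair
sum_long <- melt(sum_data, id.vars = "City")

ggplot(sum_long, aes(x = City, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = value),
            position = position_dodge(width = 0.9), vjust = -0.3)
```

The geom_text() layer adds the requested sum label above each bar.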

On Mon, Jan 15, 2018 at 9:39 PM, Kenneth Dyson  wrote:

> Hi Eric,
>
> Thanks for the detailed response.
> This is not exactly what I want to do but is close.
> I want 2 bars for each city, 1 with the sum for "yes" , the other, beside
> it, with the sum for "no".
>
> I am way off track with my method here?
>
> Thanks,
> Ken
>
> Sent from Blue <http://www.bluemail.me/r?b=11745>
> On Jan 15, 2018, at 14:34, Eric Berger  wrote:
>>
>> 'position="dodge"' has no effect in the plot because the x-axis is a
>> factor variable. The bars do not need to be moved to avoid each other. The
>> 'aes(fill=y)' is specifying that you want the color gradient to capture the
>> sums in the 'y' variable. You might be better off to use 'no' and 'yes'
>> rather than 'n' and 'y' to avoid confusion. Then you would see that the
>> statement would be 'aes(fill=yes)'. Summary: the height of each bar
>> represents the sum of the 'no' for that city, and the color of each bar
>> represents the sum of the 'yes' for that city. Your code is fine, unless
>> that is not what you were trying to do.
>>
>> HTH,
>> Eric
>>
>>
>> On Mon, Jan 15, 2018 at 6:59 PM, kenneth dyson <
>> kenn...@kidscodejeunesse.org> wrote:
>>
>>> I am trying to create a barplot displaying the sums of 2 columns of data
>>> grouped by a variable. the data is set up like this:
>>>
>>> "city" "n" "y" 
>>> mon 100 200 
>>> tor 209 300 
>>> edm 98 87 
>>> mon 20 76 
>>> tor 50 96 
>>> edm 62 27 
>>>
>>> the resulting plot should have city as the x-axis, 2 bars per city, 1
>>> representing the sum of "n" in that city, the other the sum of "y" in that
>>> city.
>>>
>>> If possible also show the sum in each bar as a label?
>>>
>>> I aggregated the data into sums like this:
>>>
>>> sum_data <- aggregate(. ~ City,data=raw_data,sum)
>>>
>>> this gave me the sums per city as I wanted but for some reason 1 of the
>>> cities is missing in the output.
>>>
>>> Using this code for the plot:
>>>
>>> ggplot(sum_data,aes(x = City,y = n)) + geom_bar(aes(fill = y),stat =
>>> "identity",position = "dodge")
>>>
>>> gave me a bar plot with one bar per city showing the sum of y as a color
>>> gradient. not what I expected given the "dodge" command in geom_bar.
>>>
>>> Thanks.
>>>


Re: [R] Steps to create spatial plots

2018-01-15 Thread Eric Berger
If layer$z is a matrix and you want to reverse the order of the rows, you
can do:

n <- nrow(layer$z)
layer$z <- layer$z[ n:1, ]
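Combining this with the question about the vector order (rows bottom-to-top, each row left-to-right), the flattened form is the transpose of the row-reversed matrix:

```r
z <- matrix(c(1, 1, 3, 4, 6, 2,
              2, 3, 4, 1, 2, 9,
              1, 4, 5, 2, 1, 8), nrow = 3, byrow = TRUE)

# as.vector() walks t() column-wise, i.e. the reversed rows in order
as.vector(t(z[nrow(z):1, ]))
# [1] 1 4 5 2 1 8 2 3 4 1 2 9 1 1 3 4 6 2
```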

HTH,
Eric


On Tue, Jan 16, 2018 at 8:43 AM, lily li  wrote:

> Sorry for the emails, I just wanted to have an example.
> layer$z
>
> 1  1  3  4  6  2
> 2  3  4  1  2  9
> 1  4  5  2  1  8
>
> How to convert the matrix to layer$z = c(1, 4, 5, 2, 1, 8, 2, 3, 4, 1, 2,
> 9, 1, 1, 3, 4, 6, 2)?
> I think this vector is the order that levelplot can use. Thanks again.
>
>
> On Mon, Jan 15, 2018 at 10:58 PM, lily li  wrote:
>
> > Hi Bert,
> >
> > I think you are correct that I can use levelplot, but I have a question
> > about converting data. For example, the statement:
> > levelplot(Z~X*Y), Z is row-wise from the lower left corner to the upper
> > right corner.
> > My dataset just have gridded Z data as a txt file (or can be called
> > matrix?), how to convert them to the vector in order for levelplot to
> use?
> > Thanks.
> >
> > On Mon, Jan 15, 2018 at 6:04 PM, Bert Gunter 
> > wrote:
> >
> >> From your description, I am **guessing** that you may not want a
> "spatial
> >> map" (including projections) at all, but rather something like a level
> >> plot. See ?levelplot in the lattice package for details. I am sure
> >> ggplot2 has something similar.
> >>
> >> Apologies if I have misunderstood your intent/specifications.
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Mon, Jan 15, 2018 at 4:54 PM, lily li  wrote:
> >>
> >>> Hi Roman,
> >>>
> >>> Thanks for your reply. For the spatial coordinates layer, I just have
> >>> coordinates of the upper left corner, numbers of rows and columns of
> the
> >>> spatial map, and grid cell size. How to create a spatial layer of
> >>> coordinates from this data? Thanks.
> >>>
> >>>
> >>> On Mon, Jan 15, 2018 at 3:26 PM, Roman Luštrik <
> roman.lust...@gmail.com>
> >>> wrote:
> >>>
> >>> > You will need to coerce your data into a "spatial" kind, as
> >>> implemented in
> >>> > `sp` or as of late, `sf` packages. You might want to give the
> >>> vignettes a
> >>> > whirl before you proceed.
> >>> > Roughly, you will have to coerce the data to Spatial* (you could go
> >>> for a
> >>> > point, raster or grid type, I think) and also specify the projection.
> >>> Once
> >>> > you have that, plotting should be handled by packages.
> >>> >
> >>> > Here are a few quick links that might come handy:
> >>> >
> >>> > https://cran.r-project.org/web/views/Spatial.html
> >>> > http://www.datacarpentry.org/R-spatial-raster-vector-lesson/10-vector-csv-to-shapefile-in-r/
> >>> >
> >>> >
> >>> > Cheers,
> >>> > Roman
> >>> >
> >>> > On Mon, Jan 15, 2018 at 11:22 PM, lily li 
> wrote:
> >>> >
> >>> >> Hi users,
> >>> >>
> >>> >> I have no clear clue about plotting spatial data. For example, I
> just
> >>> >> have a table with attribute values of each grid cell, such as
> >>> elevation.
> >>> >> Then I have coordinates of the upper left corner in UTM, the number
> >>> of rows
> >>> >> and columns, and grid cell size. How to create spatial plot of
> >>> elevations
> >>> >> for the grid cells, in color ramp? Should I create a spatial grid
> >>> layer
> >>> >> with all the polygons first? Thanks.
> >>> >>
> >>> >> --
> >>> >> --
> >>> >> You received this message because you are subscribed to the ggplot2
> >>> >> mailing list.
> >>> >> Please provide a reproducible example:
> >>> >> https://github.com/hadley/devtools/wiki/Reproducibility
> >>> >>
> >>> >> To post: email ggpl...@googlegroups.com
> >>> >> To unsubscribe: email ggplot2+unsubscr...@googlegroups.com
> >>> >> More options: http://groups.google.com/group/ggplot2
> >>> >>
> >>> >> ---
> >>> >> You received this message because you are subscribed to the Google
> >>> Groups
> >>> >> "ggplot2" group.
> >>> >> To unsubscribe from this group and stop receiving emails from it,
> >>> send an
> >>> >> email to ggplot2+unsubscr...@googlegroups.com.
> >>> >> For more options, visit https://groups.google.com/d/optout.
> >>> >>
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > In God we trust, all others bring data.
> >>> >
> >>>

[R] Split charts with ggplot2, tidyquant

2018-01-17 Thread Eric Berger
A very common chart in the financial markets is a split chart with two time
series shown in two vertically stacked sub-charts.
A classic case would be the top panel showing the time series of historical
prices of some stock, and the bottom
panel showing the volume traded per day immediately below it. The common
x-axis is the dates of the time period covered.

I would like to create such a standard plot using ggplot2. How does one do
it?
The goals of the tidyquant package would seem to include the easy creation
of such a chart, but I could not find this in tidyquant.

Suppose it were possible to easily create such a chart in ggplot2 (or
tidyquant which uses ggplot2.)
Then with such data for numerous stocks (or other financial instruments)
one could see a grid of such charts by faceting with respect to the stock.

Thanks for any help,

Eric



Re: [R] roxygen2 error - x$tag operator is invalid for atomic vectors

2018-01-17 Thread Eric Berger
This is an error message from R.
For example, if you give the following R commands
>  a <- 5
> a$foo
This will generate the error message:
Error in a$foo : $ operator is invalid for atomic vectors

So you can search for the string 'x$tag' in your code (or possibly in the
package).

HTH,
Eric


On Wed, Jan 17, 2018 at 3:16 PM, Martin Møller Skarbiniks Pedersen <
traxpla...@gmail.com> wrote:

> Hi,
>
>   I am trying to create my first R package.
>   I will later today put the files on Github.
>
>   However I gets this error and I can't find any reason for it:
>
> R> roxygen2::roxygenise()
> First time using roxygen2. Upgrading automatically...
> Error in x$tag : $ operator is invalid for atomic vectors
> R>
>
>   Any ideas?
>
> Regards
> Martin M. S. Pedersen
>

Re: [R] reading lisp file in R

2018-01-17 Thread Eric Berger
It seems the file contains records, with each record having 18 fields.
I would use awk (standard unix tool), creating an awk script to process the
file
into a new file with one line for each record, each line with 18 fields,
say comma-separated.
The csv file can then be easily read into R via the function read.csv.
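If one prefers to stay in R, here is a rough sketch along the same lines. It assumes, as the UCI page suggests, that each record is a parenthesized (def-instance Name (attribute value) ...) form; the regular expression only pulls innermost (key value) pairs, so it would need adjusting for nested attributes.

```r
# All-R sketch: split the raw text into records, then extract pairs
txt   <- paste(readLines("university.data"), collapse = " ")
recs  <- strsplit(txt, "\\(def-instance")[[1]][-1]
pairs <- lapply(recs, function(r)
  regmatches(r, gregexpr("\\(([^()]+)\\)", r))[[1]])
```

From there each record's pairs can be split on whitespace and bound into a data frame (or written to csv as above).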

HTH,
Eric


On Thu, Jan 18, 2018 at 6:22 AM, Ranjan Maitra  wrote:

> Dear friends,
>
> Is there a way to read data files written in lisp into R?
>
> Here is the file: https://archive.ics.uci.edu/
> ml/machine-learning-databases/university/university.data
>
> I would like to read it into R. Any suggestions?
>
> Thanks very much in advance for pointers on this and best wishes,
> Ranjan
>
> --
> Important Notice: This mailbox is ignored: e-mails are set to be deleted
> on receipt. Please respond to the mailing list if appropriate. For those
> needing to send personal or professional e-mail, please use appropriate
> addresses.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Split charts with ggplot2, tidyquant

2018-01-18 Thread Eric Berger
Hi Charlie,
I am comfortable putting the data in whatever form works best. Here are two
possibilities: an xts and a data frame.

library(quantmod)
quantmod::getSymbols("SPY")  # creates xts variable SPY
SPYxts <- SPY[,c("SPY.Close","SPY.Volume")]
SPYdf  <- data.frame(Date=index(SPYxts),close=as.numeric(SPYxts$SPY.Close),
 volume=as.numeric(SPYxts$SPY.Volume))
rownames(SPYdf) <- NULL

head(SPYxts)
head(SPYdf)

#            SPY.Close SPY.Volume
# 2007-01-03    141.37   94807600
# 2007-01-04    141.67   69620600
# 2007-01-05    140.54   76645300
# 2007-01-08    141.19   71655000
# 2007-01-09    141.07   75680100
# 2007-01-10    141.54   72428000

#         Date  close   volume
# 1 2007-01-03 141.37 94807600
# 2 2007-01-04 141.67 69620600
# 3 2007-01-05 140.54 76645300
# 4 2007-01-08 141.19 71655000
# 5 2007-01-09 141.07 75680100
# 6 2007-01-10 141.54 72428000

Thanks,
Eric



On Thu, Jan 18, 2018 at 8:00 PM, Charlie Redmon  wrote:

> Could you provide some information on your data structure (e.g., are the
> two time series in separate columns in the data)? The solution is fairly
> straightforward once you have the data in the right structure. And I do not
> think tidyquant is necessary for what you want.
>
> Best,
> Charlie
>
> --
> Charles Redmon
> GRA, Center for Research Methods and Data Analysis
> PhD Student, Department of Linguistics
> University of Kansas
> Lawrence, KS, USA
>
>



Re: [R] Split charts with ggplot2, tidyquant

2018-01-19 Thread Eric Berger
Hi Charlie,
Thanks. This is helpful. As mentioned in my original question, I want to be
able to plot a few such charts on the same page, say a 2 x 2 grid with such
a chart for each of 4 different stocks. Using your solution I accomplished
this by making a list pLst of your ggplots and then calling
cowplot::plot_grid( plotlist=pLst, nrow=2, ncol=2 ). That worked fine.

The one issue I have is that in the ggplot you suggest, the price and
volume facets are the same size. I would like them to be different sizes
(e.g. the volume facet at the bottom is generally shown smaller than the
facet above it in these types of charts).

I tried to find out how to do it but didn't succeed. I found a couple of
relevant discussions (including Hadley writing that he did not think it was
a useful feature. :-()

https://github.com/tidyverse/ggplot2/issues/566

and an ancient one where someone seems to have been able to get a heights
parameter working in a call to facet_grid but it did not work for me.
https://kohske.wordpress.com/2010/12/25/adjusting-the-relative-space-of-a-facet-grid/
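
A hedged workaround sketch (the data below are made up; assumes the cowplot package, which I already use above): skip faceting altogether and stack two separate plots with unequal relative heights.

```r
library(ggplot2)
library(cowplot)

df <- data.frame(Date   = as.Date("2007-01-03") + 0:5,
                 close  = c(141.37, 141.67, 140.54, 141.19, 141.07, 141.54),
                 volume = c(94807600, 69620600, 76645300,
                            71655000, 75680100, 72428000))

pPrice  <- ggplot(df, aes(Date, close))  + geom_line()
pVolume <- ggplot(df, aes(Date, volume)) + geom_col()

# price panel three times as tall as the volume panel, x-axes aligned
cowplot::plot_grid(pPrice, pVolume, ncol = 1, align = "v",
                   rel_heights = c(3, 1))
```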

Thanks again,
Eric

p.s. Joshua thanks for your suggestions, but I was hoping for a ggplot
solution.


On Fri, Jan 19, 2018 at 6:33 PM, Charlie Redmon  wrote:

> So the general strategy for getting these into separate panels in ggplot
> is to have a single variable that will be your response and a factor
> variable that indexes which original variable it came from. This can be
> accomplished in many ways, but the way I use is with the melt() function in
> the reshape2 package.
> For example,
>
> library(reshape2)
> plotDF <- melt(SPYdf,
> id.vars="Date", # variables to replicate
> measure.vars=c("close", "volume"), # variables to
> create index from
> variable.name="parameter", # name of new variable
> for index
> value.name="resp") # name of what will be your
> response variable
>
> Now the ggplot2 code:
>
> library(ggplot2)
> ggplot(plotDF, aes(x=Date, y=resp)) +
> facet_wrap(~parameter, ncol=1, scales="free") +
> geom_line()
>
>
> Hope that does the trick!
>
> Charlie
>
>
>
> On 01/18/2018 02:11 PM, Eric Berger wrote:
>
>> Hi Charlie,
>> I am comfortable to put the data in any way that works best. Here are two
>> possibilities: an xts and a data frame.
>>
>> library(quantmod)
>> quantmod::getSymbols("SPY")  # creates xts variable SPY
>> SPYxts <- SPY[,c("SPY.Close","SPY.Volume")]
>> SPYdf  <- data.frame(Date=index(SPYxts),close=as.numeric(SPYxts$SPY.Close),
>>                      volume=as.numeric(SPYxts$SPY.Volume))
>> rownames(SPYdf) <- NULL
>>
>> head(SPYxts)
>> head(SPYdf)
>>
>> #            SPY.Close SPY.Volume
>> # 2007-01-03    141.37   94807600
>> # 2007-01-04    141.67   69620600
>> # 2007-01-05    140.54   76645300
>> # 2007-01-08    141.19   71655000
>> # 2007-01-09    141.07   75680100
>> # 2007-01-10    141.54   72428000
>>
>> #         Date  close   volume
>> # 1 2007-01-03 141.37 94807600
>> # 2 2007-01-04 141.67 69620600
>> # 3 2007-01-05 140.54 76645300
>> # 4 2007-01-08 141.19 71655000
>> # 5 2007-01-09 141.07 75680100
>> # 6 2007-01-10 141.54 72428000
>>
>> Thanks,
>> Eric
>>
>>
>>
>> On Thu, Jan 18, 2018 at 8:00 PM, Charlie Redmon > <mailto:redm...@gmail.com>> wrote:
>>
>> Could you provide some information on your data structure (e.g.,
>> are the two time series in separate columns in the data)? The
>> solution is fairly straightforward once you have the data in the
>> right structure. And I do not think tidyquant is necessary for
>> what you want.
>>
>> Best,
>> Charlie
>>
>> -- Charles Redmon
>> GRA, Center for Research Methods and Data Analysis
>> PhD Student, Department of Linguistics
>> University of Kansas
>> Lawrence, KS, USA
>>
>>
>>
> --
> Charles Redmon
> GRA, Center for Research Methods and Data Analysis
> PhD Student, Department of Linguistics
> University of Kansas
> Lawrence, KS, USA
>
>



Re: [R] Split charts with ggplot2, tidyquant

2018-01-21 Thread Eric Berger
Hi Charlie and Bert,
Thank you both for the suggestions and pointers. I will look into them.

FYI I repeatedly refer to tidyquant because that package refers to itself as
"tidyquant: Tidy Quantitative Financial Analysis" and I am hoping to get the
attention of someone who is involved in the tidyquant package. The type of split
chart I am interested in is standard / prevalent in financial charting,
e.g. the charts on https://www.bloomberg.com/markets/stocks all have
an 'Indicators' button which allows you to add, say, a volume chart as a
subchart below the main part of the chart.

Thanks again,
Eric



On Sun, Jan 21, 2018 at 6:45 PM, Charlie Redmon  wrote:
> Thanks for the reminder about lattice! I did some searching and there's a
> good example of manipulating the size of subplots using the `position`
> argument (see pp. 202-203 in the Trellis Users Guide:
> http://ml.stat.purdue.edu/stat695t/writings/Trellis.User.pdf). This is not
> within the paneling environment with the headers like in other trellis plots
> though, so you'll have to do a bit more digging to see how to get that to
> work if you need those headers.
>
>
> Best,
>
> Charlie
>
>
> On 01/20/2018 03:17 PM, Bert Gunter wrote:
>>
>> That (the need for base graphics) is false. It certainly **can** be done
>> in base graphics -- see ?layout for a perhaps more straightforward way to do
>> it along the lines you suggest.
>>
>> However both lattice and ggplot are based on grid graphics, which has a
>> similar but slightly more flexible ?grid.layout function which would allow
>> one to size and place subsequent ggplot or lattice graphs in an arbitrary
>> layout as you have described (iiuc) for the base graphics case.
>>
>> Perhaps even simpler would be to use the "position" argument of the
>> print.trellis() function to locate trellis plots. Maybe ggplot() has
>> something similar.
>>
>> In any case, the underlying grid graphics functionality allows **much**
>> greater fine control of graphical elements (including rotation, for example)
>> -- at the cost of greater complexity. I would agree that doing it from
>> scratch using base grid functions is most likely overkill here, though. But
>> it's there.
>>
>> IMHO only, the base graphics system was great in its time, but its time
>> has passed. Grid graphics is much more powerful because it is objects based
>> -- that is, grid graphs are objects that can be saved, modified, and even
>> interacted with in flexible ways. Lattice and ggplot incarnations take
>> advantage of this, giving them more power and flexibility than the base
>> graphics capabilities can muster.
>>
>> I repeat -- IMHO only! Feel free to disagree. I don't want to start any
>> flame wars here.
>>
>> Cheers,
>> Bert
>>
>>
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Sat, Jan 20, 2018 at 12:19 PM, Charlie Redmon > <mailto:redm...@gmail.com>> wrote:
>>
>>     For this kind of control you will probably need to move to base
>>     graphics and utilize the `fig` argument in par(), in which case you
>>     would want to run the plot() command twice: once with your first
>>     outcome and once with your second, changing the par() settings
>>     before each one to control the size.
>>
>>
>> On 01/19/2018 01:39 PM, Eric Berger wrote:
>> > Hi Charlie,
>> > Thanks. This is helpful. As mentioned in my original question, I
>> want
>> > to be able to plot a few such charts on the same page,
>> > say a 2 x 2 grid with such a chart for each of 4 different stocks.
>> > Using your solution I accomplished this by making
>> > a list pLst of your ggplots and then calling cowplot::plot_grid(
>> > plotlist=pLst, nrow=2, ncol=2 )  That worked fine.
>> >
>> > The one issue  I have is that in the ggplot you suggest, the
>> price and
>> > volume facets are the same size. I would like them to be
>> different sizes
>> > (e.g. the volume facet at the bottom is generally shown smaller than
>> > the facet above it in these types of charts.)
>> >
>> > I tried to find out how to do it but didn't succeed. I found a
>> couple
>> > of relevant discussions (inc

Re: [R] Newbie - Scrape Data From PDFs?

2018-01-23 Thread Eric Berger
Hi Scott,
I have never done this myself but I read something recently on the
r-help distribution that was related.
I just did a quick search and found a few hits that might work for you.

1. https://medium.com/@CharlesBordet/how-to-extract-and-clean-data-from-pdf-files-in-r-da11964e252e
2. http://bxhorn.com/2016/extract-data-tables-from-pdf-files-in-r/
3. https://www.rdocumentation.org/packages/textreadr/versions/0.7.0/topics/read_pdf
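
In case it helps, a hedged sketch using the pdftools package (not covered by the links above; "traffic.pdf" is a hypothetical file name):

```r
# install.packages("pdftools")
library(pdftools)

pages <- pdf_text("traffic.pdf")        # one character string per page
lines <- strsplit(pages[1], "\n")[[1]]  # split the first page into lines
head(lines)
```

From there the lines can be trimmed and split into columns, e.g. with strsplit() on runs of whitespace.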

HTH,
Eric

On Wed, Jan 24, 2018 at 3:58 AM, Scott Clausen  wrote:
> Hello,
>
> I’m new to R and am using it with RStudio to learn the language. I’m doing so 
> as I have quite a lot of traffic data I would like to explore. My problem is 
> that all the data is located on a number of PDFs. Can someone point me to 
> info on gathering data from other sources? I’ve been to the R FAQ and didn’t 
> see anything and would appreciate your thoughts.
>
>  I am quite sure now that often, very often, in matters concerning religion 
> and politics a man's reasoning powers are not above the monkey's.
>
> -- Mark Twain
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Newbie wants to compare 2 huge RDSs row by row.

2018-01-26 Thread Eric Berger
Hi Marsh,
An RDS is not a data structure such as a data.frame. It can be anything.
For example if I want to save my objects a, b, c I could do:
> saveRDS( list(a,b,c), file="tmp.RDS" )
Then read them back later with
> myList <- readRDS( "tmp.RDS" )
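
If both files do turn out to contain data.frames with matching dimensions and column order, a minimal row-comparison sketch would be (file names are hypothetical):

```r
df1 <- readRDS("first.RDS")
df2 <- readRDS("second.RDS")
stopifnot(identical(dim(df1), dim(df2)))

# row numbers where at least one column differs
mismatch <- which(rowSums(df1 != df2, na.rm = TRUE) > 0)
mismatch
```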

Do you have additional information about your "RDSs" ?

Eric


On Sat, Jan 27, 2018 at 6:54 AM, Marsh Hardy ARA/RISK 
wrote:

> Each RDS is 40 MBs. What's a slick code to compare them row by row, IDing
> row numbers with mismatches?
>
> Thanks in advance.
>
> //
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Newbie wants to compare 2 huge RDSs row by row.

2018-01-28 Thread Eric Berger
Hi Henrik,
Thanks for pointing out the diffobj package and the clear example. Nice!


On Sun, Jan 28, 2018 at 6:22 PM, Marsh Hardy ARA/RISK 
wrote:

> Thanks, I think I've found the most succinct expression of differences in
> two data.frames...
>
> length(which( rowSums( x1 != x2 ) > 0))
>
> gives a count of the # of records in two data.frames that do not match.
>
> //
> 
> From: Henrik Bengtsson [henrik.bengts...@gmail.com]
> Sent: Sunday, January 28, 2018 11:12 AM
> To: Ulrik Stervbo
> Cc: Marsh Hardy ARA/RISK; r-help@r-project.org
> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
>
> The diffobj package (https://cran.r-project.org/package=diffobj) is
> really helpful here.  It provides "diff" functions diffPrint(),
> diffStr(), and diffChr() to compare two object 'x' and 'y' and provide
> neat colorized summary output.
>
> Example:
>
> > iris2 <- iris
> > iris2[122:125,4] <- iris2[122:125,4] + 0.1
>
> > diffobj::diffPrint(iris2, iris)
> < iris2
> > iris
> @@ 121,8 / 121,8 @@
> ~     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
>   120  6.0 2.2  5.0 1.5  virginica
>   121  6.9 3.2  5.7 2.3  virginica
> < 122  5.6 2.8  4.9 2.1  virginica
> > 122  5.6 2.8  4.9 2.0  virginica
> < 123  7.7 2.8  6.7 2.1  virginica
> > 123  7.7 2.8  6.7 2.0  virginica
> < 124  6.3 2.7  4.9 1.9  virginica
> > 124  6.3 2.7  4.9 1.8  virginica
> < 125  6.7 3.3  5.7 2.2  virginica
> > 125  6.7 3.3  5.7 2.1  virginica
>   126  7.2 3.2  6.0 1.8  virginica
>   127  6.2 2.8  4.8 1.8  virginica
>
> What's not show here is that the colored output (supported by many
> terminals these days) also highlights exactly which elements in those
> rows differ.
>
> /Henrik
>
> On Sun, Jan 28, 2018 at 12:17 AM, Ulrik Stervbo 
> wrote:
> > The anti_join from the package dplyr might also be handy.
> >
> > install.packages("dplyr")
> > library(dplyr)
> > anti_join (x1, x2)
> >
> > You can get help on the different functions by ?function.name(), so
> > ?anti_join() will bring you help - and examples - on the anti_join
> > function.
> >
> > It might be worth testing your approach on a small subset of the data.
> That
> > makes it easier for you to follow what happens and evaluate the outcome.
> >
> > HTH
> > Ulrik
> >
> Marsh Hardy ARA/RISK  wrote on Sun., 28 Jan. 2018,
> 04:14:
> >
> >> Cool, looks like that'd do it, almost as if converting an entire record
> to
> >> a character string and comparing strings.
> >>
> >> 
> >> From: William Dunlap [wdun...@tibco.com]
> >> Sent: Saturday, January 27, 2018 4:57 PM
> >> To: Marsh Hardy ARA/RISK
> >> Cc: Ulrik Stervbo; Eric Berger; r-help@r-project.org
> >> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
> >>
> >> If your two objects have class "data.frame" (look at class(objectName))
> >> and they
> >> both have the same number of columns and the same order of columns and
> the
> >> column types match closely enough (use all.equal(x1, x2) for that), then
> >> you can try
> >>  which( rowSums( x1 != x2 ) > 0)
> >> E.g.,
> >> > x1 <- data.frame(X=1:5, Y=rep(c("A","B"),c(3,2)))
> >> > x2 <- data.frame(X=c(1,2,-3,-4,5), Y=rep(c("A","B"),c(2,3)))
> >> > x1
> >>   X Y
> >> 1 1 A
> >> 2 2 A
> >> 3 3 A
> >> 4 4 B
> >> 5 5 B
> >> > x2
> >>X Y
> >> 1  1 A
> >> 2  2 A
> >> 3 -3 B
> >> 4 -4 B
> >> 5  5 B
> >> > which( rowSums( x1 != x2 ) > 0)
> >> [1] 3 4
> >>
> >> If you want to allow small numeric differences but exactly character
> >> matches
> >> you will have to get a bit fancier.  Splitting the data.frames into
> >> character and
> >> numeric parts and comparing each works well.
> >>
> >> Bill Dunlap
> >> TIBCO Software
> >> wdunlap tibco.com

Re: [R] Result show the values of fitting gamma parameter

2018-01-29 Thread Eric Berger
Capture the results of the apply command into an object and then work with
that. Here is one way to do it:

> res <- apply(C, 2, fitdist, "gamma")
> out <- c( res$A$estimate["shape"], res$B$estimate["shape"],
res$A$estimate["rate"], res$B$estimate["rate"])
> names(out) <- c("A shape","B shape","A rate","B rate")
> print(out)

#  A shape   B shape    A rate    B rate
# 3.702253 31.300800  1.234126  3.912649
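
A hedged alternative that scales to any number of columns: pull the 'estimate' vector out of each fitdist object with sapply() (a self-contained sketch of the same example):

```r
library(fitdistrplus)

A <- c(1, 2, 3, 4, 5)
B <- c(6, 7, 8, 9, 10)
res <- apply(cbind(A, B), 2, fitdist, "gamma")

# returns a matrix: rows are shape/rate, columns are A and B
sapply(res, function(f) f$estimate)
```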

HTH,
Eric


On Mon, Jan 29, 2018 at 10:25 AM, smart hendsome via R-help <
r-help@r-project.org> wrote:

> Hi,
> Let say I have data by two columns A and B, and I have fit each column
> using the gamma distribution by 'fitdist' . I just want the result show
> only the shape and rate only.
>
> Eg:
> library(fitdistrplus)
>
> A <-c(1,2,3,4,5)
>
> B<-c(6,7,8,9,10)
>
> C <-cbind(A,B)
> apply(C, 2, fitdist, "gamma")
> Output show like this:
> $A
> Fitting of the distribution ' gamma ' by maximum likelihood
> Parameters:
>   estimate Std. Error
> shape 3.702253  2.2440052
> rate  1.234126  0.8011369
>
> $B
> Fitting of the distribution ' gamma ' by maximum likelihood
> Parameters:
>estimate Std. Error
> shape 31.300800   19.69176
> rate   3.912649    2.48129
>
> I want the output to be like this:
>
>              A          B
> shape 3.702253  31.300800
> rate  1.234126   3.912649
> Can anyone solve my problem? Many thanks.
>
> Regards,
> Zuhri
>
>
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Simulation based on runif to get mean

2018-01-30 Thread Eric Berger
Or a shorter version of Rui's approach:

set.seed(2511)  # Make the results reproducible
fun <- function(n){
  f <- function(){
c(mean(runif(5,1,10)),mean(runif(5,10,20)))
  }
  replicate(n, f())
}
fun(10)

On Tue, Jan 30, 2018 at 12:03 PM, Rui Barradas  wrote:

> Hello,
>
> Another way would be to use ?replicate and ?colMeans.
>
>
> set.seed(2511)# Make the results reproducible
>
> fun <- function(n){
> f <- function(){
> a <- runif(5, 1, 10)
> b <- runif(5, 10, 20)
> colMeans(cbind(a, b))
> }
> replicate(n, f())
> }
>
> fun(10)
>
> Hope this helps,
>
> Rui Barradas
>
>
> On 1/30/2018 8:58 AM, Daniel Nordlund wrote:
>
>> On 1/29/2018 9:03 PM, smart hendsome via R-help wrote:
>>
>>> Hello everyone,
>>> I have a question regarding simulating based on runif.  Let say I have
>>> generated matrix A and B based on runif. Then I find mean for each matrix A
>>> and matrix B.  I want this process to be done let say 10 times. Anyone can
>>> help me.  Actually I want make the function that I can play around with the
>>> number of simulation process that I want. Thanks.
>>> Eg:
>>> a <- matrix(runif(5,1, 10))
>>>
>>> b <- matrix(runif(5,10, 20))
>>>
>>> c <- cbind(a,b); c
>>>
>>> mn <- apply(c,2,mean); mn
>>>
>>> Regards,
>>> Zuhri
>>>
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posti
>>> ng-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>> Here is a straight forward implementation of your code in a function with
>> a parameter for the number simulations you want to run.
>>
>> sim <- function(n){
>>mn <- matrix(0,n, 2)
>>for(i in 1:n) {
>>  a <- runif(5,1, 10)
>>  b <- runif(5,10, 20)
>>  c <- cbind(a,b)
>>  mn[i,] <- apply(c, 2, mean)
>>  }
>>return(mn)
>>}
>> # run 10 iterations
>> sim(10)
>>
>> In your case, there doesn't seem to be a need to create a and b as
>> matrices; vectors work just as well.  Also, several of the statements could
>> be combined into one.  Whether this meets your needs depends on what your
>> real world task actually is.
>>
>>
>> Hope this is helpful,
>>
>> Dan
>>
>>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posti
> ng-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Calculating angle of a polyline

2018-01-30 Thread Eric Berger
Assuming your polyline is defined by two vectors, one for the x
coordinates, one for the y coordinates, you can try the following

library(NISTunits)
polyangles <- function(xV,yV) {
  stopifnot( (length(xV)==length(yV)) && (length(xV) >= 3))
  v <- function(i) { c( xV[i]-xV[i-1], yV[i]-yV[i-1])}
  vlen <- function(v) { sqrt(sum(v*v)) }

  lV <- rep(NA_real_,length(xV))
  for ( i in 2:(length(xV)-1) )
lV[i] <- acos( sum(v(i)*v(i+1))/(vlen(v(i))*vlen(v(i+1))) )
  angleV <- NISTunits::NISTradianTOdeg(lV)
  angleV
}

# example
x <- c(0:3)
y <- c(0,0,1,1)
polyangles( x, y )

# NA 45.0 45.0 NA

Note, I have included the NA's at the beginning and end of the polyline as
a reminder that there is no angle defined there.

HTH,
Eric


On Tue, Jan 30, 2018 at 4:34 PM, Jeff Newmiller 
wrote:

> A polyline by definition has many angles, so your question is ill-formed.
> And this is a question about math, not R, so is off topic here. I suggest
> reading Wikipedia.
> --
> Sent from my phone. Please excuse my brevity.
>
> On January 29, 2018 11:10:02 PM PST, javad bayat 
> wrote:
> >Dear R users
> >I am trying to find a formula to calculate the angle of a polyline. Is
> >there a way to do this?
> >Many thanks.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Calculating angle of a polyline

2018-01-30 Thread Eric Berger
nice

On Tue, Jan 30, 2018 at 7:05 PM, William Dunlap  wrote:

> I like to use complex numbers for 2-dimensional geometry.  E.g.,
>
> > polyAngles2
> function (xV, yV)
> {
> stopifnot((length(xV) == length(yV)) && (length(xV) >= 3))
> z <- complex(re = xV, im = yV)
> c(NA, diff(Arg(diff(z))), NA) # radians, positive is counter-clockwise
> }
> > x <- c(0:3)
> > y <- c(0,0,1,1)
> > polyAngles2(x,y) / pi * 180
> [1]  NA  45 -45  NA
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Jan 30, 2018 at 7:09 AM, Eric Berger 
> wrote:
>
>> Assuming your polyline is defined by two vectors, one for the x
>> coordinates, one for the y coordinates, you can try the following
>>
>> library(NISTunits)
>> polyangles <- function(xV,yV) {
>>   stopifnot( (length(xV)==length(yV)) && (length(xV) >= 3))
>>   v <- function(i) { c( xV[i]-xV[i-1], yV[i]-yV[i-1])}
>>   vlen <- function(v) { sqrt(sum(v*v)) }
>>
>>   lV <- rep(NA_real_,length(xV))
>>   for ( i in 2:(length(xV)-1) )
>> lV[i] <- acos( sum(v(i)*v(i+1))/(vlen(v(i))*vlen(v(i+1))) )
>>   angleV <- NISTunits::NISTradianTOdeg(lV)
>>   angleV
>> }
>>
>> # example
>> x <- c(0:3)
>> y <- c(0,0,1,1)
>> polyangles( x, y )
>>
>> # NA 45.0 45.0 NA
>>
>> Note, I have included the NA's at the beginning and end of the polyline as
>> a reminder that there is no angle defined there.
>>
>> HTH,
>> Eric
>>
>>
>> On Tue, Jan 30, 2018 at 4:34 PM, Jeff Newmiller > >
>> wrote:
>>
>> > A polyline by definition has many angles, so your question is
>> ill-formed.
>> > And this is a question about math, not R, so is off topic here. I
>> suggest
>> > reading Wikipedia.
>> > --
>> > Sent from my phone. Please excuse my brevity.
>> >
>> > On January 29, 2018 11:10:02 PM PST, javad bayat 
>> > wrote:
>> > >Dear R users
>> > >I am trying to find a formula to calculate the angle of a polyline. Is
>> > >there a way to do this?
>> > >Many thanks.
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/
>> > posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>



Re: [R] copy/paste of large amount of code to terminal leads to scrambled/missing characters

2018-02-03 Thread Eric Berger
Hi Martin,
Why not just do the following?
In your editor after you create the script save it to a file, say "foo.R".
Then in your R session you give the command
> source("foo.R")
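
If you also want to see each command echoed as it runs (handy for locating errors in long scripts), source() takes an echo argument:

```r
source("foo.R", echo = TRUE, max.deparse.length = Inf)
```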

HTH,
Eric


On Sun, Feb 4, 2018 at 6:33 AM, Bert Gunter  wrote:

> Obvious suggestion: use a more capable IDE instead of Textmate2 with
> copy/paste.
>
> RStudio is very popular now, but there are many others . Search on e.g. "R
> IDE For MAC" to see some alternatives.
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Sat, Feb 3, 2018 at 4:51 PM, Jeff Newmiller 
> wrote:
>
> > This sounds like a problem with your editor or the OS clipboard support
> > rather than R. You might get a response here, but R-sig-mac seems more
> > appropriate to me for such discussion.
> > --
> > Sent from my phone. Please excuse my brevity.
> >
> > On February 3, 2018 4:23:54 PM PST, Martin Batholdy via R-help <
> > r-help@r-project.org> wrote:
> > >Dear R-users,
> > >
> > >This question might not be restricted to R, but I hope that some might
> > >have experienced similar problems and could help me.
> > >
> > >When using R, I usually work with a text-editor (textmate2) in which I
> > >prepare the script.
> > >To execute code, I then copy and paste it to an R-session running in
> > >the terminal/shell (on Mac OS).
> > >
> > >Unfortunately, when pasting too much code into the terminal (e.g. 60
> > >lines), some characters are occasionally and randomly scrambled or
> > >missing.
> > >For example "col <- ifelse(..." turns into "col < col < cse(…".
> > >
> > >This happens very randomly, is difficult to predict, and while it only
> > >affects a hand full of characters in total, it leads to a lot of errors
> > >in the code execution along the way.
> > >Apparently, it has to do with the buffer size and paste-speed of the
> > >terminal.
> > >
> > >So far, I could not find any solution to the problem.
> > >
> > >Therefore, I wanted to ask;
> > >Do others here use a similar workflow (i.e. having a text-editor for
> > >coding and using copy/paste to the terminal for code execution) and
> > >encountered similar problems with big chunks of code in the clipboard?
> > >Are there any solutions for this problem, specifically for running R
> > >over the shell?
> > >
> > >Thank you very much!
> > >
> > >__
> > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > >PLEASE do read the posting guide
> > >http://www.R-project.org/posting-guide.html
> > >and provide commented, minimal, self-contained, reproducible code.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> > posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] TreeBUGS - subscript out of bounds

2018-02-21 Thread Eric Berger
Hi Max,
Here's an example that will generate that error. Maybe it will point you to
your problem.

# create a 2x2 matrix
>  m <- matrix(1:4,nrow=2)

# refer to column 3 - which does not exist
> m[,3]
# Error in m[, 3] : subscript out of bounds

HTH,
Eric




On Wed, Feb 21, 2018 at 4:35 PM, Max Hennig  wrote:

> Dear all,
>
> I've only (very) recently started to use R (so please be easy on me if I
> may omit to mention relevant information or have overlooked fairly basic
> steps to solving the problem, since I do not have a lot of experience)
> because I'm interested in multinomial processing tree modeling with the
> TreeBUGS package (Heck, Arnold & Arnold 2017 - TreeBUGS: An R package for
> hierarchical multinomial-processing-tree modeling).
> I have attempted to conduct a permutation test with my dataset, as
> described on pages 5-6 of the paper. I've brought the data into the necessary
> long format (participant case in column 1, stimulus index in column 2,
> observed response in column 3) and specified the tree structure as
> described in the paper, in my case four trees with two possible responses
> each.
> When specifying the test, I am given the message:
>
> Error in M[, tree] : subscript out of bounds
>
> Though I've found some information on this general error online, all of it
> applies to different tests and didn't help me to solve the problem.
> Has anyone of you encountered this error before, or has a suggestion for
> me?
>
> Best,
> Max
>


Re: [R] alternative for multiple if_else statements

2018-02-22 Thread Eric Berger
Hi,
1. I think the reason that the different ordering leads to different
results is the following:
date[ some condition is true ][1]
will give you an NA if there are no rows where 'some condition holds'.
In the code that 'works' you don't have such a situation, but in the
code that 'does not work' you presumably hit an NA before you get to the
result that you really want.
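A tiny illustration of that point:

```r
# subsetting with a condition that matches nothing gives a zero-length
# vector, and taking element [1] of that silently yields NA
x <- c(10, 20, 30)
x[x > 100]       # numeric(0): no matches
x[x > 100][1]    # NA
```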
2. I am not a big fan of your "nested if" layout. I think you could rewrite
it more clearly - and without nesting - with something like

 > trialData$survey_year <- rep(NA_character_, nrow(trialData))
 > trialData$survey_year[ condition for survey_2007 ] <- "survey_2007"
 > trialData$survey_year[ condition for survey_2008 ] <- "survey_2008"
 > etc

HTH,
Eric
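Filled in with toy data (the year column and the conditions here are
stand-ins for your real survey_start/date logic):

```r
# toy stand-in for the real trialData
trialData <- data.frame(year = c(2007, 2007, 2008, 2009))

# start with all-NA, then fill one survey year at a time, no nesting
trialData$survey_year <- rep(NA_character_, nrow(trialData))
trialData$survey_year[trialData$year == 2007] <- "survey_2007"
trialData$survey_year[trialData$year == 2008] <- "survey_2008"
trialData$survey_year
# [1] "survey_2007" "survey_2007" "survey_2008" NA
```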

On Wed, Feb 21, 2018 at 10:33 PM, Kevin Wamae 
wrote:

> Hi, I am having trouble trying to figure out why if_else is behaving the
> way it is; it may be my code or the way the data is structured.
>
> Below is a snapshot of a database I am working on; it represents a
> longitudinal survey of study participants in a trial with weekly follow up.
>
> The variable "survey_start" represents the start of the study-defined one
> year follow up (which we called "survey_year").
>
> I am trying to populate all subsequent entries for each participant, per
> survey year, with the entry "survey" followed by an underscore and the
> respective year, eg. survey_2014.
>
> There are missing entries; for example, the participant represented here
> wasn't available at the start of the 2015 survey. Also, some participants
> don't have complete one-year follow-ups, but I still need to include them.
>
> I have written two pieces of code; the first fails while the second works.
> The only difference is that I have reversed the order in which the entries
> are populated in the second code (from 2007-2016 to 2016-2007) and removed
> the if_else statement for 2015. I also noticed that for the second code,
> which spans the years 2007-2016 (less 2015), if a participant's entries
> start from 2010-2016, the code fails.
>
> Kindly assist in figuring this out...or better yet, an alternative.
>
> trialData <- structure(list(study = c("site_1", "site_1", "site_1",
> "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
> "site_1", "site_1"), studyno = c("child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
> "child_1", "child_1"), date = structure(c(16078, 16085, 16092,
> 16098, 16104, 16115, 16121, 16129, 16135, 16140, 16

Re: [R] alternative for multiple if_else statements

2018-02-22 Thread Eric Berger
Hi Kevin,
I ran the code on the full data set and was able to reproduce the problem
that you are facing.
My guess is that you have an error in your intuition and/or logic, and that
this relates to the use of the subscript [1].
Specifically, on the full dataset, the condition
trialData$date[trialData$survey_start == "Y" & trialData$year == 2013 &
trialData$site == "site_1"]

yields 412 matches, of which there are 9 unique ones, specifically

April 2,3,4,5,8,10,11,16,17

In the full data set the first element that appears, i.e. subscript[1], is
"2013-04-04".

In the filtered data set the first element that appears is "2013-04-05".

I hope that is enough information for you to make further progress from
here.

Best,
Eric



On Thu, Feb 22, 2018 at 1:28 PM, Kevin Wamae 
wrote:

> Dear Eric, wow, this seems to do the trick. But I have encountered a
> problem.
>
>
>
> I have tested it on the larger dataset and it seems to work on a filtered
> dataset but not on the whole dataset (attached). See below script..
>
>
>
> #load packages
>
> library(dplyr)
> library(data.table)   # fread() comes from data.table, not dplyr
>
>
>
> #load data
>
> trialData <- fread("trialData.txt") %>% mutate(date =
> as.Date(date,"%d/%m/%Y"))
>
>
>
> #create blank variable
>
> trialData$survey_year <- rep(NA_character_, nrow(trialData))
>
>
>
> *#attempt 1 fails: code for survey*
>
> trialData$survey_year[trialData$date >= trialData$date[trialData$survey_start
> == "Y" & trialData$year == 2013 & trialData$site == "site_1"][1] &
> trialData$date < trialData$date[trialData$month == 4 & trialData$year ==
> 2014 & trialData$site == "site_1"][1]] <- "survey_2013"
>
>
>
> #filter trialData
>
> trialData <- trialData %>% filter(id == "id_786/3")
>
>
>
> *#attempt 2 works: code for survey*
>
> trialData$survey_year[trialData$date >= trialData$date[trialData$survey_start
> == "Y" & trialData$year == 2013 & trialData$site == "site_1"][1] &
> trialData$date < trialData$date[trialData$month == 4 & trialData$year ==
> 2014 & trialData$site == "site_1"][1]] <- "survey_2013"
>
>
>
>
>
>
>
> *From: *Eric Berger 
> *Date: *Thursday, 22 February 2018 at 13:05
> *To: *Kevin Wamae 
> *Cc: *"R-help@r-project.org" 
> *Subject: *Re: [R] alternative for multiple if_else statements
>
>
>
> Hi,
>
> 1. I think the reason that the different ordering leads to different
> results is the following:
>
> date[ some condition is true ][1]
>
> will give you an NA if there are no rows where 'some condition holds'.
>
> In the code that 'works' you don't have such a situation, but in the
> code that 'does not work' you presumably hit an NA before you get to the
> result that you really want.
>
> 2. I am not a big fan of your "nested if" layout. I think you could
> rewrite it more clearly - and without nesting - with something like
>
>
>
>  > trialData$survey_year <- rep(NA_character_, nrow(trialData))
>
>  > trialData$survey_year[ condition for survey_2007 ] <- "survey_2007"
>
>  > trialData$survey_year[ condition for survey_2008 ] <- "survey_2008"
>
>  > etc
>
>
>
> HTH,
>
> Eric
>
>
>
> On Wed, Feb 21, 2018 at 10:33 PM, Kevin Wamae 
> wrote:
>
> Hi, I am having trouble trying to figure out why if_else is behaving the
> way it is; it may be my code or the way the data is structured.
>
> Below is a snapshot of a database I am working on; it represents a
> longitudinal survey of study participants in a trial with weekly follow up.
>
> The variable "survey_start" represents the start of the study-defined one
> year follow up (which we called "survey_year").
>
> I am trying to populate all subsequent entries for each participant, per
> survey year, with the entry "survey" followed by an underscore and the
> respective year, eg. survey_2014.
>
> There are missing entries; for example, the participant represented here
> wasn't available at the start of the 2015 survey. Also, some participants
> don't have complete one-year follow-ups, but I still need to include them.
>
> I have written two pieces of code; the first fails while the second works.
> The only difference is that I have reversed the order in which the entries
> are populated in the second code (from 2007-2016 to 2016-2007) and removed
> the if_else statement for 2015. I also noticed that for the second code, which
> spans the years 2007-2

Re: [R] reshaping column items into rows per unique ID

2018-02-25 Thread Eric Berger
Hi Allaisone,
I took a slightly different approach but you might find this either as or
more useful than your approach, or at least a start on the path to a
solution you need.

df1   <-
data.frame(CustId=c(1,1,1,2,3,3,4,4,4),DietType=c("a","c","b","f","a","j","c","c","f"),
stringsAsFactors=FALSE)
custs <- unique(df1$CustId)
dtype <- unique(df1$DietType)
nc<- length(custs)
nd<- length(dtype)
df2   <- as.data.frame( matrix(rep(0,nc*(nd+1)),nrow=nc),
stringsAsFactors=FALSE)
colnames(df2) <- c("CustId",dtype[order(dtype)])
df2$CustId <- custs[ order(custs) ]

for ( i in 1:nrow(df1) ) {
  iRow <- match(df1$CustId[i],df2$CustId)
  iCol <- match(df1$DietType[i],colnames(df2))
  df2[ iRow, iCol ] <- df2[ iRow, iCol] + 1
}

> df2
#   CustId a b c f j
# 1      1 1 1 1 0 0
# 2      2 0 0 0 1 0
# 3      3 1 0 0 0 1
# 4      4 0 0 2 1 0

The dataframe df2 will have a column for the CustId and one column for each
unique diet type.
Each row is a unique customerId, and each entry contains the number of
times the given diet type occurred for that customer.

I hope that helps,
Eric
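For this particular counting task, the per-row loop can also be collapsed
into a single call to table(); a sketch on the same df1:

```r
df1 <- data.frame(CustId   = c(1,1,1,2,3,3,4,4,4),
                  DietType = c("a","c","b","f","a","j","c","c","f"),
                  stringsAsFactors = FALSE)

# a contingency table of CustId x DietType counts, as a data frame
df2 <- as.data.frame.matrix(table(df1$CustId, df1$DietType))
df2
#   a b c f j
# 1 1 1 1 0 0
# 2 0 0 0 1 0
# 3 1 0 0 0 1
# 4 0 0 2 1 0
```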



On Sun, Feb 25, 2018 at 7:08 PM, Bert Gunter  wrote:

> I believe you need to spend time with an R tutorial or two: a data frame
> (presumably the "table" data structure you describe) can *not* contain
> "blanks" -- all columns must be the same length, which means NA's are
> filled in as needed.
>
> Also, 8e5 * 7e4 = 5.6e10 cells, which almost certainly will not fit into any
> local version of R (maybe it would in some server version -- others more
> knowledgeable should comment on this).
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Sun, Feb 25, 2018 at 4:59 AM, Allaisone 1 
> wrote:
>
> > Hi All
> >
> > I have a datafram which looks like this :
> >
> > CustomerID    DietType
> > 1             a
> > 1             c
> > 1             b
> > 2             f
> > 2             a
> > 3             j
> > 4             c
> > 4             c
> > 4             f
> >
> > And I would like to reshape this so I can see the list of DietTypes per
> > customer in rows instead of columns like this :
> >
> > > MyDf
> > CustomerID  DietType  DietType  DietType
> > 1           a         c         b
> > 2           f         a
> > 3           j
> > 4           c         c         f
> >
> > I tried many times using melt(),spread (),and dcast () functions but was
> > not able to produce the desired table. The best attempt was by typing :
> >
> > # 1) Adding new column with unique values:
> > MyDf $newcol <- c (1:9)
> > #2) then :
> > NewDf <- dcast(MyDf, CustomerID ~ newcol, value.var = "DietType")
> >
> > This produces the desired table but with many NA values like this :
> >
> > CustomerID  1   2   3   4   5   6   7   8   9
> > 1           a   c   b   NA  NA  NA  NA  NA  NA
> > 2           NA  NA  NA  f   a   NA  NA  NA  NA
> > 3           NA  NA  NA  NA  NA  j   NA  NA  NA
> > 4           NA  NA  NA  NA  NA  NA  c   c   f
> >
> >   As you see, the letter/s indicating DietType move to the right each
> > time we move down, leaving many NA values. As my original file is very
> > large, I expect that the final output would contain around 800,000
> > columns and 70,000 rows. This is why my code works with small data but
> > fails with my large file because of memory issues, even though I'm using
> > a large PC.
> >
> > What changes do I need to make to my code to produce the desired table,
> > where the list of DietTypes is grouped in rows exactly like the second
> > table shown above?
> >
> > Regards
> > Allaisone
> >
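For the desired row-wise listing itself, one base-R option (shown here on
the small example data, not the 800,000-column case) is to collapse each
customer's diet types into a single delimited string, which sidesteps the
NA padding entirely:

```r
MyDf <- data.frame(CustomerID = c(1,1,1,2,3,3,4,4,4),
                   DietType   = c("a","c","b","f","a","j","c","c","f"),
                   stringsAsFactors = FALSE)

# one row per customer; all diet types joined into a comma-separated string
aggregate(DietType ~ CustomerID, data = MyDf, FUN = paste, collapse = ",")
#   CustomerID DietType
# 1          1    a,c,b
# 2          2        f
# 3          3        j
# 4          4    c,c,f
```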

Re: [R] Repeated use of dyn.load().

2018-03-01 Thread Eric Berger
Good question Rolf.
Rui, thanks for pointing out dyn.unload.
When I started using Rcpp a couple of years ago I got burned by stale .so
enough times that I adopted a policy of recompile-then-start new R session.
My workflow does not include Rolf's "brazillion" repeats, so the overhead
of this approach has not been too painful.
The documentation for dyn.unload (via ?dyn.unload) includes the following
statement:

"The function dyn.unload unlinks the DLL. Note that unloading a DLL and
then re-loading a DLL of the same name may or may not work: on Solaris it
uses the first version loaded."

Eric
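For reference, a small wrapper along these lines can make the
unload-then-reload step explicit rather than relying on luck (a sketch;
"bar.so" is a stand-in name, and the path must match the one originally
passed to dyn.load):

```r
reload_dll <- function(path) {
  # getLoadedDLLs() is keyed by the DLL's base name without extension
  name <- tools::file_path_sans_ext(basename(path))
  if (name %in% names(getLoadedDLLs()))
    dyn.unload(path)              # drop the stale copy first
  dyn.load(path)
}

# typical debug loop (run from R after editing the Fortran source):
#   system("R CMD SHLIB bar.f")   # rebuild the shared object
#   reload_dll("bar.so")
#   .Fortran("mysub", ...)        # exercise the fresh build
```

As the quoted documentation warns, whether the re-load truly takes effect
is platform-dependent, so a periodic clean restart of R is still prudent.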



On Thu, Mar 1, 2018 at 2:21 PM, Rui Barradas  wrote:

> Hello,
>
> In such cases, with C code, I call dyn.unload before loading the modified
> shared lib again.
> I don't know if this changed recently, but it used to be needed or else R
> wouldn't load the new lib. When I call dyn.unload followed by dyn.load I
> never had problems.
> (Or the other way around, call dyn.unload before modifying the C code.)
>
> Hope this helps,
>
> Rui Barradas
>
> On 3/1/2018 8:52 AM, Rolf Turner wrote:
>
>>
>> I am working with a function "foo" that explicitly dynamically loads a
>> shared object library or "DLL", doing something like dyn.load("bar.so").
>>  This is a debugging exercise so I make changes to the underlying Fortran
>> code (yes, I acknowledge that I am a dinosaur) remake the DLL "bar.so" and
>> then run foo again.  This is all *without* quitting and restarting R.  (I'm
>> going to have to do this a few brazillion times, and
>> I want the iterations to be as quick as possible.)
>>
>> This seems to work --- i.e. foo seems to obtain the latest version of
>> bar.so.  But have I just been lucky so far?  (I have not experimented
>> heavily).
>>
>> Am I running risks of leading myself down the garden path?  Are there
>> Traps for Young (or even Old) Players lurking about?
>>
>> I would appreciate Wise Counsel.
>>
>> cheers,
>>
>> Rolf Turner
>>
>>


Re: [R] Variable centring within "predict.coxph"

2018-03-02 Thread Eric Berger
Hi Laura,
I will state up front that I have no experience or knowledge of the Cox
model or the survival package.
Out of curiosity I loaded the package and did ?coxph and found the
following comment in the documentation:

"The routine internally scales and centers data to avoid overflow in the
argument to the exponential function. These actions do not change the
result, but lead to more numerical stability."

This would seem to imply that for "good" cases, i.e. those which are
numerically stable, the two approaches (centered or not-centered) should
give the same results. Maybe you can find a test data set that is a "good"
case and compare the Stata and R results for that data to gain confidence
that you are calling the package correctly.

HTH,
Eric
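One more pointer from the same help page: predict.coxph has a reference
argument, and reference = "zero" is documented to produce predictions that
are not centred (worth verifying against your installed survival version).
A sketch using Laura's own example:

```r
library(survival)
test1 <- list(time   = c(4, 3, 1, 1, 2, 2, 3),
              status = c(1, 1, 1, 0, 1, 1, 0),
              x      = c(0, 2, 1, 1, 1, 0, 0),
              sex    = c(0, 0, 0, 0, 1, 1, 1))
mod1 <- coxph(Surv(time, status) ~ x + sex, test1)

lp_centred   <- predict(mod1, type = "lp")                      # default: centred
lp_uncentred <- predict(mod1, type = "lp", reference = "zero")  # no centring

# the two should differ only by the constant sum(coef * covariate means)
shift <- sum(coef(mod1) * mod1$means)
all.equal(lp_uncentred - lp_centred, rep(shift, 7), check.attributes = FALSE)
```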



On Fri, Mar 2, 2018 at 3:38 PM, Bonnett, Laura 
wrote:

> Dear R-help,
>
> I am using R-3.3.2 on Windows 10.  I teach on a course which has 4
> computer practical sessions related to the development and validation of
> clinical prediction models.  These are currently written for Stata and I am
> in the process of writing them for use in R too (as I far prefer R to
> Stata!)
>
> I notice that predictions made from a Cox model in Stata are based on
> un-centred variables, while they are based on centred variables in R.  I am
> aware that variable centring is the preferred approach to ensure sensible
> predictions from models and thus usually I am pleased that variable
> centring is automatically applied within coxph in R.  However, for the sake
> of producing identical results across the software packages, is there a way
> to produce predictions from a Cox model in R without variable centring?
>
> I am using the 'survival' package as follows (for example):
> library(survival)
> test1 <- list(time=c(4,3,1,1,2,2,3),
>   status=c(1,1,1,0,1,1,0),
>   x=c(0,2,1,1,1,0,0),
>   sex=c(0,0,0,0,1,1,1))
> mod1 <- coxph(Surv(time, status) ~ x + sex, test1)
> predict(mod1,type="lp")
>
> [This can of course be alternatively obtained from mod1$linear.predictor]
>
> Many thanks for your assistance.
>
> Kind regards,
> Laura
>
>


Re: [R] lmrob gives NA coefficients

2018-03-04 Thread Eric Berger
What is 'd'? What is 'n'?


On Sun, Mar 4, 2018 at 12:14 PM, Christien Kerbert <
christienkerb...@gmail.com> wrote:

> Thanks for your reply.
>
> I use mvrnorm from the *MASS* package and lmrob from the *robustbase*
> package.
>
> To further explain my data generating process, the idea is as follows. The
> explanatory variables are generated by a multivariate normal distribution
> where the covariance matrix of the variables is defined by Sigma in my
> code, with ones on the diagonal and rho = 0.15 on the off-diagonal. Then y
> is created by y = 1 + 2*x1 + 3*x2 + 4*x3 + error, and the error term is
> standard normally distributed.
>
> Hope this helps.
>
> Regards,
> Christien
>
> In this section, we provide a simulation study to illustrate the
> performance of four estimators, the (GLS), S, MM and MM ridge estimator for
> SUR model. This simulation process is executed to generate data for the
> following equation   Where  In this simulation, we set the initial value
> for β= [1,2,3] for k=3. The explanatory variables are generated by
> multivariate normal distribution MNNk=3 (0,∑x) where diag(∑x)=1,
> off-diag(∑x)= ρX= 0.15 for low interdependency and ρx= 0.70 for high
> interdependency. Where ρx is correlation between explanatory variables. We
> chose two sample size 25 for small sample and 100 for large sample. The
> specific error in equations μi, i=1,2,…..,n, we generated by MVNk=3 (0,
> ∑ε), ∑ε the variance covariance matrix of errors, diag(∑ε)= 1,
> off-diag(∑ε)= ρε= 0.15. To investigate the robustness of the estimators
> against outliers, we chosen different percentages of outliers ( 20%, 45%).
> We choose shrink parameter in (12) by minimize the new robust Cross
> Validation (CVMM) criterion which avoided
>
> 2018-03-04 0:52 GMT+01:00 David Winsemius :
>
> >
> > > On Mar 3, 2018, at 3:04 PM, Christien Kerbert <
> > christienkerb...@gmail.com> wrote:
> > >
> > > Dear list members,
> > >
> > > I want to perform an MM-regression. This seems an easy task using the
> > > function lmrob(), however, this function provides me with NA
> > coefficients.
> > > My data generating process is as follows:
> > >
> > > rho <- 0.15  # low interdependency
> > > Sigma <- matrix(rho, d, d); diag(Sigma) <- 1
> > > x.clean <- mvrnorm(n, rep(0,d), Sigma)
> >
> > Which package are you using for mvrnorm?
> >
> > > beta <- c(1.0, 2.0, 3.0, 4.0)
> > > error <- rnorm(n = n, mean = 0, sd = 1)
> > > y <- as.data.frame(beta[1]*rep(1, n) + beta[2]*x.clean[,1] +
> > > beta[3]*x.clean[,2] + beta[4]*x.clean[,3] + error)
> > > xy.clean <- cbind(x.clean, y)
> > > colnames(xy.clean) <- c("x1", "x2", "x3", "y")
> > >
> > > Then, I pass the following formula to lmrob: f <- y ~ x1 + x2 + x3
> > >
> > > Finally, I run lmrob: lmrob(f, data = data, cov = ".vcov.w")
> > > and this results in NA coefficients.
> >
> > It would also be more courteous to specify the package where you are
> > getting lmrob.
> >
> > >
> > > It would be great if anyone can help me out. Thanks in advance.
> > >
> > > Regards,
> > > Christien
> > >
> > >   [[alternative HTML version deleted]]
> >
> > This is a plain text mailing list although it doesn't seem to have
> created
> > problems this time.
> >
> > >
> >
> > David Winsemius
> > Alameda, CA, USA
> >
> > 'Any technology distinguishable from magic is insufficiently advanced.'
> >  -Gehm's Corollary to Clarke's Third Law
> >
> >
> >
> >
> >
> >
>


Re: [R] lmrob gives NA coefficients

2018-03-04 Thread Eric Berger
Hard to help you if you don't provide a reproducible example.
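For instance, here is a guess at a self-contained version of your setup
(note that beta has 4 elements, an intercept plus d = 3 slopes, and that
the model is fit with data = dat rather than the undefined 'data' object
from your original post, which may well be the source of the NAs):

```r
library(MASS)        # mvrnorm
library(robustbase)  # lmrob

set.seed(1)
n <- 100; d <- 3; rho <- 0.15
Sigma <- matrix(rho, d, d); diag(Sigma) <- 1
x.clean <- mvrnorm(n, rep(0, d), Sigma)

beta <- c(1, 2, 3, 4)                                 # intercept + 3 slopes
y <- beta[1] + drop(x.clean %*% beta[2:4]) + rnorm(n)

dat <- data.frame(x.clean, y)
colnames(dat) <- c("x1", "x2", "x3", "y")

fit <- lmrob(y ~ x1 + x2 + x3, data = dat, cov = ".vcov.w")
coef(fit)   # should be close to (1, 2, 3, 4), with no NAs
```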

On Sun, Mar 4, 2018 at 1:05 PM, Christien Kerbert <
christienkerb...@gmail.com> wrote:

> d is the number of observed variables (d = 3 in this example). n is the
> number of observations.
>
> 2018-03-04 11:30 GMT+01:00 Eric Berger :
>
>> What is 'd'? What is 'n'?
>>
>>
>> On Sun, Mar 4, 2018 at 12:14 PM, Christien Kerbert <
>> christienkerb...@gmail.com> wrote:
>>
>>> Thanks for your reply.
>>>
>>> I use mvrnorm from the *MASS* package and lmrob from the *robustbase*
>>> package.
>>>
>>> To further explain my data generating process, the idea is as follows.
>>> The
>>> explanatory variables are generated my a multivariate normal distribution
>>> where the covariance matrix of the variables is defined by Sigma in my
>>> code, with ones on the diagonal and rho = 0.15 on the non-diagonal. Then
>>> y
>>> is created by y = 1 - 2*x1 + 3*x3 + 4*x4 + error and the error term is
>>> standard normal distributed.
>>>
>>> Hope this helps.
>>>
>>> Regards,
>>> Christien
>>>
>>> In this section, we provide a simulation study to illustrate the
>>> performance of four estimators, the (GLS), S, MM and MM ridge estimator
>>> for
>>> SUR model. This simulation process is executed to generate data for the
>>> following equation   Where  In this simulation, we set the initial value
>>>
>>> for β= [1,2,3] for k=3. The explanatory variables are generated by
>>> multivariate normal distribution MNNk=3 (0,∑x) where diag(∑x)=1,
>>> off-diag(∑x)= ρX= 0.15 for low interdependency and ρx= 0.70 for high
>>> interdependency. Where ρx is correlation between explanatory variables.
>>> We
>>> chose two sample size 25 for small sample and 100 for large sample. The
>>> specific error in equations μi, i=1,2,…..,n, we generated by MVNk=3 (0,
>>> ∑ε), ∑ε the variance covariance matrix of errors, diag(∑ε)= 1,
>>> off-diag(∑ε)= ρε= 0.15. To investigate the robustness of the estimators
>>> against outliers, we chosen different percentages of outliers ( 20%,
>>> 45%).
>>> We choose shrink parameter in (12) by minimize the new robust Cross
>>> Validation (CVMM) criterion which avoided
>>>
>>> 2018-03-04 0:52 GMT+01:00 David Winsemius :
>>>
>>> >
>>> > > On Mar 3, 2018, at 3:04 PM, Christien Kerbert <
>>> > christienkerb...@gmail.com> wrote:
>>> > >
>>> > > Dear list members,
>>> > >
>>> > > I want to perform an MM-regression. This seems an easy task using the
>>> > > function lmrob(), however, this function provides me with NA
>>> > coefficients.
>>> > > My data generating process is as follows:
>>> > >
>>> > > rho <- 0.15  # low interdependency
>>> > > Sigma <- matrix(rho, d, d); diag(Sigma) <- 1
>>> > > x.clean <- mvrnorm(n, rep(0,d), Sigma)
>>> >
>>> > Which package are you using for mvrnorm?
>>> >
>>> > > beta <- c(1.0, 2.0, 3.0, 4.0)
>>> > > error <- rnorm(n = n, mean = 0, sd = 1)
>>> > > y <- as.data.frame(beta[1]*rep(1, n) + beta[2]*x.clean[,1] +
>>> > > beta[3]*x.clean[,2] + beta[4]*x.clean[,3] + error)
>>> > > xy.clean <- cbind(x.clean, y)
>>> > > colnames(xy.clean) <- c("x1", "x2", "x3", "y")
>>> > >
>>> > > Then, I pass the following formula to lmrob: f <- y ~ x1 + x2 + x3
>>> > >
>>> > > Finally, I run lmrob: lmrob(f, data = data, cov = ".vcov.w")
>>> > > and this results in NA coefficients.
>>> >
>>> > It would also be more courteous to specify the package where you are
>>> > getting lmrob.
>>> >
>>> > >
>>> > > It would be great if anyone can help me out. Thanks in advance.
>>> > >
>>> > > Regards,
>>> > > Christien
>>> > >
>>> > >   [[alternative HTML version deleted]]
>>> >
>>> > This is a plain text mailing list although it doesn't seem to have
>>> created
>>> > problems this time.
>>> >
>>> > >

Re: [R] Change Function based on ifelse() condtion

2018-03-04 Thread Eric Berger
Hi Christofer,
You cannot assign to list(...). You can do the following

myList <- list(...)[!names(list(...)) %in% 'mc.cores']

HTH,
Eric

On Sun, Mar 4, 2018 at 6:38 PM, Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:

> Hi,
>
> As an example, I want to create the below kind of custom function which
> can be either mclapply or lapply
>
> Lapply_me = function(X = X, FUN = FUN, ..., Apply_MC = FALSE) {
> if (Apply_MC) {
> return(mclapply(X, FUN, ...))
> } else {
> if (any(names(list(...)) == 'mc.cores')) {
> list(...) = list(...)[!names(list(...)) %in% 'mc.cores']
> }
> return(lapply(X, FUN, ...))
> }
> }
>
> However when Apply_MC = FALSE it generates below error saying :
>
>   '...' used in an incorrect context
>
>
> Appreciate if you can help me with the correct approach. Thanks,
>
>
> On Sun, Mar 4, 2018 at 9:34 PM, Duncan Murdoch 
> wrote:
> > On 04/03/2018 10:39 AM, Christofer Bogaso wrote:
> >>
> >> Hi again,
> >>
> >> I am looking for some way to alternately use 2 related functions,
> >> based on some ifelse() condition.
> >>
> >> For example, I have 2 functions mclapply() and lapply()
> >>
> >> However, mclapply() function has one extra parameter 'mc.cores' which
> >> lapply doesnt not have.
> >>
> >> I know when mc.cores = 1, these 2 functions are essentially the same;
> >> however, I am looking for a more general way to control them within an
> >> ifelse() condition
> >>
> >> Can someone please help me how can I use them within ifelse() condition.
> >
> >
> > Don't.  ifelse() usually evaluates *both* the true and false values, and
> > then selects entries from each.  Just use an if statement.
> >
> > Duncan Murdoch
>


Re: [R] Change Function based on ifelse() condtion

2018-03-04 Thread Eric Berger
Hi Christofer,
Before you made the change that I suggested, your program was stopping at
the statement: list(...) = list(...) etc.
This means that it never tried to execute the statement:
return(lapply(X,FUN,...))
Now that you have made the change, it gets past the first statement and
tries to execute the statement: return(lapply(X,FUN,...)).
That attempt is generating the error message because whatever you are
passing in as the FUN argument is not expecting extra arguments.

HTH,
Eric
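For completeness, one way to write Lapply_me without assigning to list(...)
is to capture the dots once and dispatch with do.call; a sketch:

```r
library(parallel)  # for mclapply

Lapply_me <- function(X, FUN, ..., Apply_MC = FALSE) {
  args <- list(...)  # capture the dots once; list(...) cannot be assigned to
  if (Apply_MC) {
    return(do.call(mclapply, c(list(X = X, FUN = FUN), args)))
  }
  if (!is.null(names(args)))
    args <- args[names(args) != "mc.cores"]  # lapply() has no mc.cores
  do.call(lapply, c(list(X = X, FUN = FUN), args))
}

Lapply_me(1:3, function(x) x^2, mc.cores = 2, Apply_MC = FALSE)
# mc.cores is dropped and lapply runs, returning list(1, 4, 9)
```

Any remaining named arguments (other than mc.cores) are still forwarded to
FUN in both branches.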


On Sun, Mar 4, 2018 at 6:52 PM, Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:

> @Eric - with this approach I am getting below error :
>
> Error in FUN(X[[i]], ...) : unused argument (list())
>
> On Sun, Mar 4, 2018 at 10:18 PM, Eric Berger 
> wrote:
> > Hi Christofer,
> > You cannot assign to list(...). You can do the following
> >
> > myList <- list(...)[!names(list(...)) %in% 'mc.cores']
> >
> > HTH,
> > Eric
> >
> > On Sun, Mar 4, 2018 at 6:38 PM, Christofer Bogaso
> >  wrote:
> >>
> >> Hi,
> >>
>> As an example, I want to create the below kind of custom function which
>> can be either mclapply or lapply
> >>
> >> Lapply_me = function(X = X, FUN = FUN, ..., Apply_MC = FALSE) {
> >> if (Apply_MC) {
> >> return(mclapply(X, FUN, ...))
> >> } else {
> >> if (any(names(list(...)) == 'mc.cores')) {
> >> list(...) = list(...)[!names(list(...)) %in% 'mc.cores']
> >> }
> >> return(lapply(X, FUN, ...))
> >> }
> >> }
> >>
> >> However when Apply_MC = FALSE it generates below error saying :
> >>
> >>   '...' used in an incorrect context
> >>
> >>
> >> Appreciate if you can help me with the correct approach. Thanks,
> >>
> >>
> >> On Sun, Mar 4, 2018 at 9:34 PM, Duncan Murdoch <
> murdoch.dun...@gmail.com>
> >> wrote:
> >> > On 04/03/2018 10:39 AM, Christofer Bogaso wrote:
> >> >>
> >> >> Hi again,
> >> >>
> >> >> I am looking for some way to alternately use 2 related functions,
> >> >> based on some ifelse() condition.
> >> >>
> >> >> For example, I have 2 functions mclapply() and lapply()
> >> >>
> >> >> However, mclapply() function has one extra parameter 'mc.cores' which
> >> >> lapply doesnt not have.
> >> >>
> >> >> I know when mc.cores = 1, these 2 functions are essentially same,
> >> >> however I am looking for more general way to control them within
> >> >> ifelse() constion
> >> >>
> >> >> Can someone please help me how can I use them within ifelse()
> >> >> condition.
> >> >
> >> >
> >> > Don't.  ifelse() usually evaluates *both* the true and false values,
> and
> >> > then selects entries from each.  Just use an if statement.
> >> >
> >> > Duncan Murdoch
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change Function based on ifelse() condtion

2018-03-04 Thread Eric Berger
That's fine. The issue is how you called Lapply_me(). What did you pass as
the argument to FUN?
And if you did not pass anything, then how is FUN declared?
You have not shown that in your email.




On Sun, Mar 4, 2018 at 7:11 PM, Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:

> My modified function looks below :
>
> Lapply_me = function(X = X, FUN = FUN, Apply_MC = FALSE, ...) {
> if (Apply_MC) {
> return(mclapply(X, FUN, ...))
> } else {
> if (any(names(list(...)) == 'mc.cores')) {
> myList = list(...)[!names(list(...)) %in% 'mc.cores']
> }
> return(lapply(X, FUN, myList))
> }
> }
>
> Here, I am not passing ... anymore rather passing myList
>
> On Sun, Mar 4, 2018 at 10:37 PM, Eric Berger 
> wrote:
> > Hi Christofer,
> > Before you made the change that I suggested, your program was stopping at
> > the statement: list(...) = list(..) .etc
> > This means that it never tried to execute the statement:
> > return(lapply(X,FUN,...))
> > Now that you have made the change, it gets past the first statement and
> > tries to execute the statement: return(lapply(X,FUN,...)).
> > That attempt is generating the error message because whatever you are
> > passing in as the FUN argument is not expecting extra arguments.
> >
> > HTH,
> > Eric
> >
> >
> > On Sun, Mar 4, 2018 at 6:52 PM, Christofer Bogaso
> >  wrote:
> >>
> >> @Eric - with this approach I am getting below error :
> >>
> >> Error in FUN(X[[i]], ...) : unused argument (list())
> >>
> >> On Sun, Mar 4, 2018 at 10:18 PM, Eric Berger 
> >> wrote:
> >> > Hi Christofer,
> >> > You cannot assign to list(...). You can do the following
> >> >
> >> > myList <- list(...)[!names(list(...)) %in% 'mc.cores']
> >> >
> >> > HTH,
> >> > Eric
> >> >
> >> > On Sun, Mar 4, 2018 at 6:38 PM, Christofer Bogaso
> >> >  wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> As an example, I want to create below kind of custom Function which
> >> >> either be mclapply pr lapply
> >> >>
> >> >> Lapply_me = function(X = X, FUN = FUN, ..., Apply_MC = FALSE) {
> >> >> if (Apply_MC) {
> >> >> return(mclapply(X, FUN, ...))
> >> >> } else {
> >> >> if (any(names(list(...)) == 'mc.cores')) {
> >> >> list(...) = list(...)[!names(list(...)) %in% 'mc.cores']
> >> >> }
> >> >> return(lapply(X, FUN, ...))
> >> >> }
> >> >> }
> >> >>
> >> >> However when Apply_MC = FALSE it generates below error saying :
> >> >>
> >> >>   '...' used in an incorrect context
> >> >>
> >> >>
> >> >> Appreciate if you can help me with the correct approach. Thanks,
> >> >>
> >> >>
> >> >> On Sun, Mar 4, 2018 at 9:34 PM, Duncan Murdoch
> >> >> 
> >> >> wrote:
> >> >> > On 04/03/2018 10:39 AM, Christofer Bogaso wrote:
> >> >> >>
> >> >> >> Hi again,
> >> >> >>
> >> >> >> I am looking for some way to alternately use 2 related functions,
> >> >> >> based on some ifelse() condition.
> >> >> >>
> >> >> >> For example, I have 2 functions mclapply() and lapply()
> >> >> >>
> >> >> >> However, mclapply() function has one extra parameter 'mc.cores'
> >> >> >> which
> >> >> >> lapply doesnt not have.
> >> >> >>
> >> >> >> I know when mc.cores = 1, these 2 functions are essentially same,
> >> >> >> however I am looking for more general way to control them within
> >> >> >> ifelse() constion
> >> >> >>
> >> >> >> Can someone please help me how can I use them within ifelse()
> >> >> >> condition.
> >> >> >
> >> >> >
> >> >> > Don't.  ifelse() usually evaluates *both* the true and false
> values,
> >> >> > and
> >> >> > then selects entries from each.  Just use an if statement.
> >> >> >
> >> >> > Duncan Murdoch
> >> >>
> >> >> __
> >> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> >> PLEASE do read the posting guide
> >> >> http://www.R-project.org/posting-guide.html
> >> >> and provide commented, minimal, self-contained, reproducible code.
> >> >
> >> >
> >
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change Function based on ifelse() condtion

2018-03-04 Thread Eric Berger
The reason that it works for Apply_MC=TRUE is that in that case you call
mclapply(X,FUN,...) and
the mclapply() function strips off the mc.cores argument from the "..."
list before calling FUN, so FUN is called with no extra arguments, exactly
as it is declared.

A quick workaround is to change the line

Lapply_me(as.list(1:4), function(xx) {

to

Lapply_me(as.list(1:4), function(xx,dummyList) {

HTH,
Eric
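
[Archive note: the "unused argument (list())" error in this thread comes from
passing myList positionally into FUN. A hedged sketch of one way to write the
wrapper so nothing extra ever reaches FUN; the names here are illustrative,
not the original poster's final code:]

```r
library(parallel)

# Sketch: strip mc.cores from ... and forward the remaining arguments via
# do.call(), so FUN only ever receives the arguments it was declared with.
Lapply_me <- function(X, FUN, Apply_MC = FALSE, ...) {
  if (Apply_MC) {
    mclapply(X, FUN, ...)
  } else {
    dots <- list(...)
    dots <- dots[names(dots) != "mc.cores"]  # drop the mclapply-only argument
    do.call(lapply, c(list(X, FUN), dots))
  }
}

Lapply_me(as.list(1:4), function(xx) letters[xx], Apply_MC = FALSE, mc.cores = 2)
# a list containing "a", "b", "c", "d"
```

With an empty `...` (or one containing only mc.cores), `dots` filters down to an
empty list and the call reduces to plain lapply(X, FUN), avoiding the spurious
list() argument.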


On Sun, Mar 4, 2018 at 7:21 PM, Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:

> Below is my full implementation (tried to make it simple as for
> demonstration)
>
> Lapply_me = function(X = X, FUN = FUN, Apply_MC = FALSE, ...) {
> if (Apply_MC) {
> return(mclapply(X, FUN, ...))
> } else {
> if (any(names(list(...)) == 'mc.cores')) {
> myList = list(...)[!names(list(...)) %in% 'mc.cores']
> }
> return(lapply(X, FUN, myList))
> }
> }
>
>
> Lapply_me(as.list(1:4), function(xx) {
> if (xx == 1) return('a')
> if (xx == 2) return('b')
> if (xx == 3) return('c')
> if (xx == 4) return('d')
> }, Apply_MC = FALSE, mc.cores = 2)
>
> Error message :
>
> Error in FUN(X[[i]], ...) : unused argument (list())
>
> Kindly note that, with Apply_MC = TRUE, it is working perfectly.
>
> On Sun, Mar 4, 2018 at 10:45 PM, Eric Berger 
> wrote:
> > That's fine. The issue is how you called Lapply_me(). What did you pass
> as
> > the argument to FUN?
> > And if you did not pass anything that how is FUN declared?
> > You have not shown that in your email.
> >
> >
> >
> >
> > On Sun, Mar 4, 2018 at 7:11 PM, Christofer Bogaso
> >  wrote:
> >>
> >> My modified function looks below :
> >>
> >> Lapply_me = function(X = X, FUN = FUN, Apply_MC = FALSE, ...) {
> >> if (Apply_MC) {
> >> return(mclapply(X, FUN, ...))
> >> } else {
> >> if (any(names(list(...)) == 'mc.cores')) {
> >> myList = list(...)[!names(list(...)) %in% 'mc.cores']
> >> }
> >> return(lapply(X, FUN, myList))
> >> }
> >> }
> >>
> >> Here, I am not passing ... anymore rather passing myList
> >>
> >> On Sun, Mar 4, 2018 at 10:37 PM, Eric Berger 
> >> wrote:
> >> > Hi Christofer,
> >> > Before you made the change that I suggested, your program was stopping
> >> > at
> >> > the statement: list(...) = list(..) .etc
> >> > This means that it never tried to execute the statement:
> >> > return(lapply(X,FUN,...))
> >> > Now that you have made the change, it gets past the first statement
> and
> >> > tries to execute the statement: return(lapply(X,FUN,...)).
> >> > That attempt is generating the error message because whatever you are
> >> > passing in as the FUN argument is not expecting extra arguments.
> >> >
> >> > HTH,
> >> > Eric
> >> >
> >> >
> >> > On Sun, Mar 4, 2018 at 6:52 PM, Christofer Bogaso
> >> >  wrote:
> >> >>
> >> >> @Eric - with this approach I am getting below error :
> >> >>
> >> >> Error in FUN(X[[i]], ...) : unused argument (list())
> >> >>
> >> >> On Sun, Mar 4, 2018 at 10:18 PM, Eric Berger 
> >> >> wrote:
> >> >> > Hi Christofer,
> >> >> > You cannot assign to list(...). You can do the following
> >> >> >
> >> >> > myList <- list(...)[!names(list(...)) %in% 'mc.cores']
> >> >> >
> >> >> > HTH,
> >> >> > Eric
> >> >> >
> >> >> > On Sun, Mar 4, 2018 at 6:38 PM, Christofer Bogaso
> >> >> >  wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> As an example, I want to create below kind of custom Function
> which
> >> >> >> either be mclapply pr lapply
> >> >> >>
> >> >> >> Lapply_me = function(X = X, FUN = FUN, ..., Apply_MC = FALSE) {
> >> >> >> if (Apply_MC) {
> >> >> >> return(mclapply(X, FUN, ...))
> >> >> >> } else {
> >> >> >> if (any(names(list(...)) == 'mc.cores')) {
> >> >> >> list(...) = list(...)[!names(list(...)) %in% 'mc.cores']
> >> >> >> }
> >> >> >> return(lapply(X, FUN, ...))
> >> >> >> }

Re: [R] add single points to a level plot

2018-03-08 Thread Eric Berger
You need to load the package 'rasterVis'

> library(rasterVis)

HTH,
Eric
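
[Archive note: lattice::levelplot() has no method for Raster objects, hence the
dispatch error; rasterVis supplies one. Note also that the quoted code subsets
with pts$z1 and pts$z2, but the object is named xy, so that mismatch would be
the next error. A hedged sketch of the corrected session, assuming the raster
and rasterVis packages are installed and that the sampled coordinate columns
are named x and y:]

```r
library(raster)
library(rasterVis)  # provides the levelplot() method for Raster* objects
library(sp)

s  <- stack(replicate(2, raster(matrix(runif(100), 10))))
xy <- data.frame(coordinates(sampleRandom(s, 10, sp = TRUE)),
                 z1 = runif(10), z2 = runif(10))
coordinates(xy) <- ~ x + y  # promote to SpatialPointsDataFrame for sp.points()

levelplot(s, margin = FALSE, at = seq(0, 1, 0.05)) +
  layer(sp.points(xy, pch = ifelse(xy$z1 < 0.5, 2, 3), cex = 2, col = 1),
        columns = 1) +
  layer(sp.points(xy, pch = ifelse(xy$z2 < 0.5, 2, 3), cex = 2, col = 1),
        columns = 2)
```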


On Thu, Mar 8, 2018 at 5:11 PM, lily li  wrote:

> Hi all,
>
> I ran the code:
> > s <- stack(replicate(2, raster(matrix(runif(100), 10
> > xy <- data.frame(coordinates(sampleRandom(s, 10, sp=TRUE)),
> +  z1=runif(10), z2=runif(10))
> > levelplot(s, margin=FALSE, at=seq(0, 1, 0.05)) +
> +   layer(sp.points(xy, pch=ifelse(pts$z1 < 0.5, 2, 3), cex=2, col=1),
> columns=1) +
> +   layer(sp.points(xy, pch=ifelse(pts$z2 < 0.5, 2, 3), cex=2, col=1),
> columns=2)
>
> And got the error:
> Error in UseMethod("levelplot") :
>   no applicable method for 'levelplot' applied to an object of class
> "c('RasterStack', 'Raster', 'RasterStackBrick', 'BasicRaster')"
>
> what is the problem? Thanks.
>
> On Thu, Mar 8, 2018 at 12:07 AM, lily li  wrote:
>
> > Hi all,
> >
> > I'm trying to add single points with known coordinates to a level plot,
> > but could not find the proper answer. I got to know that layer() function
> > is good for this, but I don't know which package is related to this
> > function. The source is here:
> > https://stackoverflow.com/questions/28597149/add-xy-points-to-raster-map-generated-by-levelplot
> >
> > but my question is a little different as I know the coordinates of the
> > single point, rather than a range. Thanks for any help you could provide.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Learning advanced R

2018-03-14 Thread Eric Berger
Bert's suggestion is good as a pointer to a variety of resources.
Sticking to the book format there are two of Hadley Wickham's books, which
have the advantage that they are freely available.
You can either read them online or download the source from github and
create your own copy (which you can then print, if desired.)
1. "R for Data Science"
 online: http://r4ds.had.co.nz/
 github: https://github.com/hadley/r4ds
2. "Advanced R"
 online: https://adv-r.hadley.nz/
 github: https://github.com/hadley/adv-r

Best,
Eric



On Wed, Mar 14, 2018 at 12:13 AM, Rich Shepard 
wrote:

> On Tue, 13 Mar 2018, Mark Leeds wrote:
>
> See Hadley's advanced R
>>
>
>   +1 A very well written, highly useful book. Recommended.
>
> Rich
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the same function returning different values when called differently..

2018-03-14 Thread Eric Berger
Hi Akshay,
Presumably PFC.NS and snl[[159]] are not exactly the same.
You can start by trying to determine if (and then how) they differ.
e.g.
> identical(PFC.NS, snl[[159]])
presumably this will result in FALSE

then compare
> class(PFC.NS)
> class(snl[[159]])

etc

HTH,
Eric
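
[A tiny illustration of this drill-down, on hypothetical objects rather than
the poster's data:]

```r
a <- c(x = 1, y = 2, z = 3)
b <- c(x = 1, y = 2, z = 3 + 1e-10)

identical(a, b)  # FALSE -- exact comparison, including attributes
all.equal(a, b)  # TRUE  -- equality within numerical tolerance
class(a)         # "numeric" for both, so class alone won't reveal the cause
str(a)           # Named num [1:3] 1 2 3 ... shows structure at a glance
```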


On Wed, Mar 14, 2018 at 1:42 PM, akshay kulkarni 
wrote:

> dear members,
>
>  I have a function ygrpc which acts on the daily price increments of a
> stock. It returns the following values:
>
>
>  ygrpc(PFC.NS,"h")
>  [1]  2.149997  1.875000  0.75  0.349991  2.16  0.17
> 4.00  2.574996  0.50  0.34  1.50  0.71
> [13]  0.50  1.33  0.449997  2.83  2.724998 66.150002
> 0.550003  0.050003  1.224991  4.84  1.375000  1.574997
> [25]  1.649994  0.449997  0.975006  2.475006  0.125000  2.625000
> 3.649994  0.34  1.33  2.074997  1.025001  0.125000
> [37]  3.84  0.025002  0.824997  1.75  1.67  1.75
> 1.67  0.275002  2.33  0.349998  0.75  0.224998
> [49]  0.125000  1.475006  1.58  0.125000  0.50  0.75
> 1.08  0.225006  1.274997  1.33  3.00  0.33
> [61]  0.724999  3.67  2.424995  8.425003  1.01  2.025009
> 0.850006  4.00  1.724991  0.949997  1.825012  2.799988
> [73]  0.425003  1.75  5.75  2.125000  0.125000  4.00
> 2.350006  1.524994  1.25  0.33  0.949997  0.449997
> [85]  1.84  1.75  1.150009  0.849998  2.449997  5.33  0.1
>
> I also have a list of stocks called "snl" with snl[[159]] pointing to
> PFC.NS (above):
> tail(snl[[159]])
>            PFC.NS.Open PFC.NS.High PFC.NS.Low PFC.NS.Close PFC.NS.Volume PFC.NS.Adjusted
> 2018-03-07       98.40       98.45       95.1        95.30       7854861           95.30
> 2018-03-08       94.90       94.95       89.3        91.90      12408061           91.90
> 2018-03-09       92.00       94.50       91.9        93.10       7680222           93.10
> 2018-03-12       93.40       93.85       86.1        88.25      12617833           88.25
> 2018-03-13       89.20       91.85       86.2        89.85      12792630           89.85
> 2018-03-14       88.65       89.30       86.1        86.70      16406495           86.70
>
> But ygrpc(snl[[159]],"h") returns :
> ygrpc(snl[[159]],"h")
> [1] 1
>
> Can you please shed some light on what is happening?
>
> Very many thanks for your time and effort
>
> Yours sincerely
> AKSHAY M KULKARNI
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R crashing with a segmentation fault: how to locate the cause

2018-03-14 Thread Eric Berger
I have a littler script which is crashing with a segmentation fault.
I tried to find out why by running it through valgrind, which produced the
output below.
I am not sure how to proceed from here (other than binary search with print
statements).
Any help would be appreciated.

Thanks,
Eric

==12589== Invalid read of size 1
==12589==at 0x4C2F1B1: strcmp (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12589==by 0x4F71AC1: Rf_inherits (in /usr/lib/R/lib/libR.so)
==12589==by 0x11AFED3A: dplyr::subset_visitor_vector(SEXPREC*)
(subset_visitor_impl.h:51)
==12589==by 0x11AFF58C: dplyr::subset_visitor(SEXPREC*,
dplyr::SymbolString const&) (subset_visitor_impl.h:21)
==12589==by 0x11AFEC18:
dplyr::DataFrameSubsetVisitors::DataFrameSubsetVisitors(Rcpp::DataFrame_Impl
const&) (DataFrameSubsetVisitors.h:33)
==12589==by 0x11B1F773: subset
> (DataFrameSubsetVisitors.h:120)
==12589==by 0x11B1F773:
filter_ungrouped(Rcpp::DataFrame_Impl,
dplyr::NamedQuosure const&) (filter.cpp:89)
==12589==by 0x11B1FD89:
filter_impl(Rcpp::DataFrame_Impl,
dplyr::NamedQuosure) (filter.cpp:106)
==12589==by 0x11AD45CB: _dplyr_filter_impl (RcppExports.cpp:192)
==12589==by 0x4F0DBAC: ??? (in /usr/lib/R/lib/libR.so)
==12589==by 0x4F50E50: Rf_eval (in /usr/lib/R/lib/libR.so)
==12589==by 0x4F534FF: ??? (in /usr/lib/R/lib/libR.so)
==12589==by 0x4F50C28: Rf_eval (in /usr/lib/R/lib/libR.so)
==12589==  Address 0x2c is not stack'd, malloc'd or (recently) free'd
==12589==
==12589==
==12589== Process terminating with default action of signal 11 (SIGSEGV)
==12589==  Access not within mapped region at address 0x2C
==12589==at 0x4C2F1B1: strcmp (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12589==by 0x4F71AC1: Rf_inherits (in /usr/lib/R/lib/libR.so)
==12589==by 0x11AFED3A: dplyr::subset_visitor_vector(SEXPREC*)
(subset_visitor_impl.h:51)
==12589==by 0x11AFF58C: dplyr::subset_visitor(SEXPREC*,
dplyr::SymbolString const&) (subset_visitor_impl.h:21)
==12589==by 0x11AFEC18:
dplyr::DataFrameSubsetVisitors::DataFrameSubsetVisitors(Rcpp::DataFrame_Impl
const&) (DataFrameSubsetVisitors.h:33)
==12589==by 0x11B1F773: subset
> (DataFrameSubsetVisitors.h:120)
==12589==by 0x11B1F773:
filter_ungrouped(Rcpp::DataFrame_Impl,
dplyr::NamedQuosure const&) (filter.cpp:89)
==12589==by 0x11B1FD89:
filter_impl(Rcpp::DataFrame_Impl,
dplyr::NamedQuosure) (filter.cpp:106)
==12589==by 0x11AD45CB: _dplyr_filter_impl (RcppExports.cpp:192)
==12589==by 0x4F0DBAC: ??? (in /usr/lib/R/lib/libR.so)
==12589==by 0x4F50E50: Rf_eval (in /usr/lib/R/lib/libR.so)
==12589==by 0x4F534FF: ??? (in /usr/lib/R/lib/libR.so)
==12589==by 0x4F50C28: Rf_eval (in /usr/lib/R/lib/libR.so)
==12589==  If you believe this happened as a result of a stack
==12589==  overflow in your program's main thread (unlikely but
==12589==  possible), you can try to increase the size of the
==12589==  main thread stack using the --main-stacksize= flag.
==12589==  The main thread stack size used in this run was 8388608.
==12589==
==12589== HEAP SUMMARY:
==12589== in use at exit: 224,025,975 bytes in 111,522 blocks
==12589==   total heap usage: 1,104,400 allocs, 992,878 frees, 625,925,991
bytes allocated
==12589==
==12589== LEAK SUMMARY:
==12589==definitely lost: 0 bytes in 0 blocks
==12589==indirectly lost: 0 bytes in 0 blocks
==12589==  possibly lost: 18,724 bytes in 47 blocks
==12589==still reachable: 224,007,251 bytes in 111,475 blocks
==12589== suppressed: 0 bytes in 0 blocks
==12589== Rerun with --leak-check=full to see details of leaked memory
==12589==
==12589== For counts of detected and suppressed errors, rerun with: -v
==12589== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
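
[A later note for the archive: two things often help here. First, R can be
started under valgrind directly, as described in "Writing R Extensions", which
avoids hand-crafting the wrapper invocation. Second, a trace that dies inside
dplyr's compiled code frequently indicates a binary mismatch between dplyr and
Rcpp; reinstalling dplyr from source after an Rcpp upgrade is a common fix.
The script name below is a placeholder.]

```sh
# Run the failing script under valgrind via R's -d flag
# (see "Writing R Extensions", section on valgrind):
R -d valgrind -f myscript.R
R -d "valgrind --track-origins=yes" -f myscript.R

# If the trace implicates a compiled package, rebuild it against the
# currently installed Rcpp:
Rscript -e 'install.packages("dplyr", type = "source")'
```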

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: the same function returning different values when called differently..

2018-03-14 Thread Eric Berger
Hi Akshay,

(Please include r-help when replying)

You have learned that PFC.NS and snl[[159]] are not identical. Now you have
to figure out why they differ. This could also point to a bug or a logic
error in your program.
Figuring out how two objects differ can be a bit tricky, but with
experience it becomes easier. (Some others may even have some suggestions
for good ways to do it.)
Basically you would work your way down. At the top level is the class of
each, which you already tested and they are identical.

Now try:

> str(PFC.NS)

and compare it to

> str(snl[[159]])

Look closely at the two outputs to try to detect differences. If they are
the same then you will have to examine the sub-objects described by str().
You didn't mention what type of objects they are. Suppose they are both
"numeric" vectors. Then you can check whether their lengths are equal.
And you can compare their values, etc. etc. No short cut that I can think
of without more information.

It definitely takes work to find discrepancies. Think of it as a challenge
and perhaps write functions that you can use to automate this kind of
comparison in the future.
(Again, other people on this list might be able to point to tools that help
with this for objects of specific type.)

Good luck,
Eric
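
[As a starting point for automating such a comparison, a rough helper along
these lines -- purely illustrative -- works top-down in the order described
above:]

```r
# Hypothetical helper: report the first few ways two objects differ.
compareObjects <- function(a, b) {
  if (identical(a, b)) return("objects are identical")
  msgs <- character(0)
  if (!identical(class(a), class(b)))
    msgs <- c(msgs, paste("class:", toString(class(a)), "vs", toString(class(b))))
  if (!identical(dim(a), dim(b)))
    msgs <- c(msgs, "dim() differs")
  if (!identical(names(a), names(b)))
    msgs <- c(msgs, "names() differ")
  ae <- all.equal(a, b)          # describes value-level differences, if any
  if (!isTRUE(ae)) msgs <- c(msgs, ae)
  if (length(msgs) == 0) "objects differ only in some unchecked attribute" else msgs
}

compareObjects(1:3, c(1, 2, 3))  # reports the integer-vs-numeric class difference
```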




dear Eric,

   Bulls eye...! identical(PFC.NS, snl[[159]]) is returning FALSE... but
class(PFC.NS) and class(snl[[159]]) are the same...

I want snl[[159]] to be equal to PFC.NS...how do I effect it? create a list
with some other element list... for example snl[[200]] == PFC.NS?


very many thanks for your time and effort...


yours sincerely,

AKSHAY M KULKARNI




------
*From:* Eric Berger 
*Sent:* Wednesday, March 14, 2018 5:22 PM
*To:* akshay kulkarni
*Cc:* R help Mailing list
*Subject:* Re: [R] the same function returning different values when called
differently..

Hi Akshay,
Presumably PFC.NS and snl[[159]] are not exactly the same.
You can start by trying to determine if (and then how) they differ.
e.g.
> identical(PFC.NS, snl[[159]])
presumably this will result in FALSE

then compare
> class(PFC.NS)
> class(snl[[159]])

etc

HTH,
Eric


On Wed, Mar 14, 2018 at 1:42 PM, akshay kulkarni 
wrote:

dear members,

 I have a function ygrpc which acts on the daily price increments of a
stock. It returns the following values:


 ygrpc(PFC.NS,"h")
 [1]  2.149997  1.875000  0.75  0.349991  2.16  0.17  4.00
2.574996  0.50  0.34  1.50  0.71
[13]  0.50  1.33  0.449997  2.83  2.724998 66.150002  0.550003
0.050003  1.224991  4.84  1.375000  1.574997
[25]  1.649994  0.449997  0.975006  2.475006  0.125000  2.625000  3.649994
0.34  1.33  2.074997  1.025001  0.125000
[37]  3.84  0.025002  0.824997  1.75  1.67  1.75  1.67
0.275002  2.33  0.349998  0.75  0.224998
[49]  0.125000  1.475006  1.58  0.125000  0.50  0.75  1.08
0.225006  1.274997  1.33  3.00  0.33
[61]  0.724999  3.67  2.424995  8.425003  1.01  2.025009  0.850006
4.00  1.724991  0.949997  1.825012  2.799988
[73]  0.425003  1.75  5.75  2.125000  0.125000  4.00  2.350006
1.524994  1.25  0.33  0.949997  0.449997
[85]  1.84  1.75  1.150009  0.849998  2.449997  5.33  0.1

I also have a list of stocks called "snl" with snl[[159]] pointing to
PFC.NS (above):
tail(snl[[159]])
           PFC.NS.Open PFC.NS.High PFC.NS.Low PFC.NS.Close PFC.NS.Volume PFC.NS.Adjusted
2018-03-07       98.40       98.45       95.1        95.30       7854861           95.30
2018-03-08       94.90       94.95       89.3        91.90      12408061           91.90
2018-03-09       92.00       94.50       91.9        93.10       7680222           93.10
2018-03-12       93.40       93.85       86.1        88.25      12617833           88.25
2018-03-13       89.20       91.85       86.2        89.85      12792630           89.85
2018-03-14       88.65       89.30       86.1        86.70      16406495           86.70

But ygrpc(snl[[159]],"h") returns :
ygrpc(snl[[159]],"h")
[1] 1

Can you please shed some light on what is happening?

Very many thanks for your time and effort

Yours sincerely
AKSHAY M KULKARNI







[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help


PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

[R] Fwd: Learning advanced R

2018-03-14 Thread Eric Berger
Hi Albrecht,
I am forwarding your reply to the full group.

It's been a while since I did this and I don't remember the details. Maybe
someone else can comment. (I am a bit busy at the moment.)
If no one supplies the information in a few days I will try to take a look.

In the meantime you can start your reading on-line. :-)

Regards,
Eric


Dear Eric,

I downloaded the material from   https://github.com/hadley/adv-r as a zip
file and decompressed it.  But, how to build the book from this? The
directory book contains a R-script buildbook.R. I downloaded all packages
that are required, but the script does not run. Is there an additional
script required?

Best,
Albrecht

--
  Albrecht Kauffmann
  alkau...@fastmail.fm

On Wed, 14 Mar 2018, at 09:13, Eric Berger wrote:
> Bert's suggestion is good as a pointer to a variety of resources.
> Sticking to the book format there are two of Hadley Wickham's books, which
> have the advantage that they are freely available.
> You can either read them online or download the source from github and
> create your own copy (which you can then print, if desired.)
> 1. "R for Data Science"
>  online: http://r4ds.had.co.nz/
>  github: https://github.com/hadley/r4ds
> 2. "Advanced R"
>  online: https://adv-r.hadley.nz/
>  github: https://github.com/hadley/adv-r
>
> Best,
> Eric
>
>
>
> On Wed, Mar 14, 2018 at 12:13 AM, Rich Shepard 
> wrote:
>
> > On Tue, 13 Mar 2018, Mark Leeds wrote:
> >
> > See Hadley's advanced R
> >>
> >
> >   +1 A very well writte, highly useful book. Recommended.
> >
> > Rich
> >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stats 'dist' euclidean distance calculation

2018-03-15 Thread Eric Berger
Hi Cheyenne,

I noticed one thing that might be helpful to you.
First, I took a shortcut to the case of interest:

> m <- matrix(c(2,1,1,0,1,1,NA,1,1,NA,1,1,2,1,2,0,1,0),nrow=3)
> colnames(m) <- c("1.G","1.A","2.C","2.A","3.G","3.A")
> m
#      1.G 1.A 2.C 2.A 3.G 3.A
# [1,]   2   0  NA  NA   2   0
# [2,]   1   1   1   1   1   1
# [3,]   1   1   1   1   2   0

Computing the distance between the different rows by hand - TREATING THE
NA's AS ZERO -
would give:
dist(row1,row2) = sqrt( 1^2 + 1^2 + 1^2 + 1^2 + 1^2 + 1^2) = sqrt(6) = 2.45
dist(row1,row3) = sqrt(  1^2 + 1^2 + 1^2 + 1^2 + 0^2 + 0^2) = sqrt(4) = 2
dist(row2,row3) = sqrt(   0^2 + 0^2 + 0^2 + 0^2 + 1^2 + 1^2) = sqrt(2) =
1.414

Doing the same calculation with the dist() function gives
> dist(m)
#        1      2
# 2   2.45
# 3   1.73  1.414

i.e. the results match with the manual calculation for dist(row1,row2) and
dist(row2,row3).
However for dist(row1,row3) which should be 2, the dist function gives 1.73
= sqrt(3).
Clearly sqrt(3) makes no sense since |1 - NA|^2 appears twice. Either both
times it should get a value of 1 or neither time. Why only once? Not clear
to me and I did not see any hints on ?dist.

However if you replace the NA's by actual 0's, which seems to be your
preferred methodology, then the problem is "solved", i.e.
> m2 <- m
> m2[1,][is.na(m2[1,])] <- 0
> dist(m2)
#        1      2
# 2   2.45
# 3   2     1.414

HTH,
Eric
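
[Archive note: the sqrt(3) does follow a documented rule. ?dist states that
when columns are excluded because of missing values, "the sum is scaled up
proportionally to the number of columns used". For rows 1 and 3, only 4 of the
6 columns are NA-free, so the partial sum 2 is scaled by 6/4:]

```r
m <- matrix(c(2,1,1, 0,1,1, NA,1,1, NA,1,1, 2,1,2, 0,1,0), nrow = 3)

# row 1 vs row 3: partial sum over the 4 NA-free columns
partial <- sum((m[1, ] - m[3, ])^2, na.rm = TRUE)  # 2
sqrt(partial * 6 / 4)                              # 1.732051, matching dist(m)
```

So replacing the NAs with 0 does not just "solve" the discrepancy, it changes
which rule is applied: with zeros, all 6 columns enter the sum; with NAs, dist()
rescales the partial sum instead.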



On Thu, Mar 15, 2018 at 2:11 AM, Cheyenne Yancey Payne  wrote:

> Hello,
>
> I am working with a matrix of multilocus genotypes for ~180 individual
> snail samples, with substantial missing data. I am trying to calculate the
> pairwise genetic distance between individuals using the stats package
> 'dist' function, using euclidean distance. I took a subset of this dataset
> (3 samples x 3 loci) to test how euclidean distance is calculated:
>
> 3x3 subset used
>  Locus1 Locus2 Locus3
> Samp1   GG  GG
> Samp2   AG CA  GA
> Samp3   AG CA  GG
>
> The euclidean distance function is defined as: sqrt(sum((x_i - y_i)^2))
> My assumption was that the difference between x_i and y_i would be the
> number of allelic differences at each base pair site between samples. For
> example, the euclidean distance between Samp1 and Samp2 would be:
>
> euclidean distance = sqrt( S1_L1 - S2_L1)^2 + (S1_L2 - S2_L2)^2 + (S1_L3 -
> S2_L3)^2 )
> at locus 1: GG - AG --> one basepair difference --> (1)^2 = 1
> at locus 2:  - CA --> two basepair differences --> (2)^2 = 4
> at locus 3: GG - GA --> one basepair difference --> (1)^2 = 1
>
> euclidean distance = sqrt( 1 + 4 + 1 ) = sqrt( 6 ) = 2.44940
>
> Calculating euclidean distances this way, the distance matrix should be:
> #   Samp1   Samp2 Samp3
> # Samp1   0.00   2.449400  2.236068
> # Samp2   2.449400   0.00  1.00
> # Samp3   2.236068   1.00  0.00
>
> However, this distance matrix differs from the one calculated by the R
> stats package 'dist' function:
> #   Samp1   Samp2 Samp3
> # Samp1   0.00   3.478652  2.659285
> # Samp2   3.478652   0.00  2.124787
> # Samp3   2.659285   2.124787  0.00
>
> I used the following code (with intermediate output) to achieve the latter
> distance matrix:
>
>
> >>>
> setwd("~/Desktop/R_stuff")
>
> ### Data Prep: load and collect info from matrix file
> infile<-'~/Desktop/R_stuff/good_conch_mplex_03052018.txt'
> Mydata <- read.table(infile, header = TRUE, check.names = FALSE)
> dim(Mydata) # dimensions of data.frame
> ind <- as.character(Mydata$sample) # individual labels
> population <- as.character(Mydata$region) # population labels
> location <- Mydata$location
>
> ### Section 1: Convert data to usable format
> # removes non-genotype data from matrix (i.e. lines 1-4)
> # subset 3 samples, 3 loci for testing
> SAMPS<-3
> locus <- Mydata[c(1:SAMPS), -c(1, 2, 3, 4, 5+SAMPS:ncol(Mydata))]
> locus
> #   Locus1 Locus2 Locus3
> # Samp1   GG  GG
> # Samp2   AG CA  GA
> # Samp3   AG CA  GG
>
> # converts geno matrix to genind object (adegenet)
> Mydata1 <- df2genind(locus, ploidy = 2, ind.names = ind[1:SAMPS], pop =
> population[1:SAMPS], sep="")
> Mydata1$tab # get stats on newly created genind object
> #       Locus1.G Locus1.A Locus2.C Locus2.A Locus3.G Locus3.A
> # Samp1        2        0       NA       NA        2        0
> # Samp2  1  11

Re: [R] Creating the right table from lapply list

2018-03-29 Thread Eric Berger
I like Don's answer which is clean and clear. A frequently useful
alternative to lapply() is sapply().
Taking the sapply() route also avoids the need for do.call(). So a modified
version of Don's code would be:

## example data
output <- list(1:5, 1:7, 1:4)

maxlen <- max( sapply(output, length) )
outputmat <-  sapply(output, function(x, maxl) c(x, rep(NA,
maxl-length(x))), maxl=maxlen)
write.csv(outputmat, na='')
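
A self-contained check of the padding idea (lengths() is base R shorthand
for sapply(output, length)):

```r
output <- list(1:5, 1:7, 1:4)

maxlen <- max(lengths(output))  # length of the longest element, here 7
outputmat <- sapply(output, function(x) c(x, rep(NA, maxlen - length(x))))

dim(outputmat)     # 7 rows (longest vector) by 3 columns (one per list element)
outputmat[6:7, 1]  # NA NA: the shorter vectors are padded at the bottom
```

write.csv(outputmat, na='') then writes the NAs as empty cells, which is
usually what you want when opening the file in Excel.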


On Thu, Mar 29, 2018 at 2:18 AM, MacQueen, Don  wrote:

> Perhaps this toy example will help:
>
> ## example data
> output <- list(1:5, 1:7, 1:4)
>
> lens <- lapply(output, length)
> maxlen <- max(unlist(lens))
> outputmod <- lapply(output, function(x, maxl) c(x, rep(NA,
> maxl-length(x))), maxl=maxlen)
> outputmat <- do.call(cbind, outputmod)
> write.csv(outputmat, na='')
>
> The idea is to pad the shorter vectors with NA (missing) before converting
> to a matrix structure.
>
> I don't really need to know where the data came from, or that it's ncdf
> data, or how many months or years, etc. But I do need to know the structure
> of your "output" list. I'm assuming each element is a simple vector of
> numbers, and that the vectors' lengths are not all the same. If that's
> correct, then my example may be what you need.
>
> This uses only base R methods, which I generally prefer. And no doubt it
> can be done more cleverly, or in a way that needs fewer intermediate
> variables ... but I don't really care.
>
> -Don
>
> --
> Don MacQueen
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> Lab cell 925-724-7509
>
>
> On 3/28/18, 9:32 AM, "R-help on behalf of orlin mitov via R-help" <
> r-help-boun...@r-project.org on behalf of r-help@r-project.org> wrote:
>
> Hello,
>   I have no previous experience with R, but had to learn on the fly in
> the past couple of weeks. Basically, what I am trying to do is read a
> certain variable from a series of files and save it as csv-table. The
> variable has an hourly value for each month in a year for the past 20 years
> and has to be read for different geographical locations. So there will be
> 12 files per year (1 for each month) and the values for the variable from
> each file will be 696 to 744 (depending on how many days x 24 hours there
> were in the month).What I achieved is to to read the values from all 12
> files stored in directory with a function and add them as vectors to a
> lapply-list:
>
>
>
> Myfunction <- function(filename) {
>  nc <- nc_open(filename)
>  lon <- ncvar_get(nc, "lon")
>  lat <- ncvar_get(nc, "lat")
>  RW <- ncvar_get(nc, "X")
>  HW <- ncvar_get(nc, "Y")
>  pt.geo <- c(8.6810 , 50.1143)
>  dist <- sqrt( (lon - pt.geo[1])^2 + (lat - pt.geo[2])^2 )
>  ind <- which(dist==min(dist, na.rm=TRUE),arr.ind=TRUE)
>  sis <- ncvar_get(nc, "SIS", start=c(ind[1],ind[2],1), count=c(1,1,-1))
>  vec <- c(sis)
> }
>
> filenames <- list.files(path = "C:/Users/Desktop/VD/Solardaten/NC",
> pattern = "nc", full.names = TRUE)
>  output <- lapply(filenames, Myfunction)
>
>
>
> And here start my problems with saving "output" as a csv table. Output
> would contain 12 vectors of different lenght.I want to have them as 12
> columns (1x per month) in Excel and each column should have as many
> row-entries as there are values for this month.Whatever I tried with
> write.table I was not able to achieve this (tried converting the output to
> a matrix, also no successes).Please help! Or should I be trying to have the
> 12 elements as data frames and not vectors?
> This is how I want the table for each year to look - 12 columns and
> all the respective values in the rows (column names I can add by myself):
> Best regardsOrlin
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert numeric variables to factor

2018-04-10 Thread Eric Berger
You are missing a comma between "MARITAL" and "JOBSTATUS".

On Tue, Apr 10, 2018 at 10:27 AM, Saif Tauheed 
wrote:

> I run this command for converting the numerical variable into factor.
> However, I get the following error message.
>
> > cols<- c(“GrMM", "RELG", "CASTE1", "SECTOR", "SECTOR4","AGE", "MARITAL"
> "JOBSTATUS", "ENG", "EDU", "PARENT_EDU", "MASSMEDIA_F", "MASSMEDIA_M",
> "HomeComputer", "HomeInternet") for (i in cols) {data.frame[,i]=
> as.factor(data.frame[,i])}
>
>
> Error: unexpected string constant in “cols<- c(“GrMM", "RELG", "CASTE1",
> "SECTOR", "SECTOR4","AGE", "MARITAL" "JOBSTATUS""
>
> Please help.
>
> Regards
> Afzal
>
>
> > On 10-Apr-2018, at 12:14 AM, Rui Barradas  wrote:
> >
> > Hello,
> >
> > Though Bert's and David's answers are what you should do, note that some
> R functions that need factors will coerce their input variables when
> necessary.
> > Have you tried to run the code you haven't posted without coercing to
> factor? It might run...
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > On 4/9/2018 6:11 PM, David L Carlson wrote:
> >> Try the help files:
> >> ?factor
> >> 
> >> David L Carlson
> >> Department of Anthropology
> >> Texas A&M University
> >> College Station, TX 77843-4352
> >> -Original Message-
> >> From: R-help  On Behalf Of Saif Tauheed
> >> Sent: Monday, April 9, 2018 11:29 AM
> >> To: r-help@r-project.org
> >> Subject: Re: [R] convert numeric variables to factor
> >> Dear Sir,
> >> I have xlsx data set which I have imported to R studio. Now some of the
> variables are defined as numeric but I want define them as factor variable
> so that I run classification algorithm in R.
> >> Please help me to convert the variables.
> >> Thanks and Regards
> >> Abu Afzal
> >> PhD Eco
> >> JNU
> >> India
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Understanding which

2018-04-18 Thread Eric Berger
Here's a hint:

> y <- which(x>100)
> identical(y,y)
# TRUE
> identical(y,-y)
# TRUE

The '-' is misleading: it is applied to an empty integer vector, and
-integer(0) is still integer(0). So x[-y] is x[integer(0)], which selects
no elements rather than all of them.
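
A minimal sketch of the pitfall, plus two idioms that return all of x when
nothing matches:

```r
x <- 1:100
y <- which(x > 100)  # integer(0): nothing matches

x[-y]                # also integer(0): -y is still an empty index vector

# idioms that keep all of x when no element satisfies the condition:
x[!(x > 100)]                # logical indexing
if (length(y)) x[-y] else x  # guard the negative index
```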

HTH,
Eric



On Wed, Apr 18, 2018 at 2:13 PM, Ashim Kapoor  wrote:

> Dear All,
>
> Here is a reprex:
>
> > x<- 1:100
> > x[-which(x>100)]
> integer(0)
>
> In words, I am finding out which indices correspond to values in x which
> are  greater than  100 ( there are no such items ) . Then I remove those
> indices. I should get back the x that I started with since there are no
> items in x which are bigger than 100 . Instead, it is returning an empty
> vector.
>
> Why is this ? What am I misunderstanding?
>
> Best Regards,
> Ashim
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Check if row of dataframe is superset of any row in another dataframe.

2018-04-21 Thread Eric Berger
Hi Neha,
How about this?

A <- as.matrix(A)
B <- as.matrix(B)

C   <- A %*% t(B)
SA  <- apply(A, MAR=1, sum )
SB  <- apply(B, MAR=1, sum )
vapply( 1:nrow(B), function(j) { sum( C[,j]==SA & SA <= SB[j] ) > 0 }, 1 )
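
Checking this against the example matrices quoted below (rowSums() is
equivalent to the apply(..., MAR=1, sum) calls above):

```r
A <- matrix(c(1,0,1,0,  1,1,0,0,  0,1,1,0), nrow = 3, byrow = TRUE)
B <- matrix(c(1,0,1,0,  1,1,1,0,  1,1,1,1,  0,0,0,1), nrow = 4, byrow = TRUE)

C  <- A %*% t(B)  # C[i,j] counts the 1s that row i of A and row j of B share
SA <- rowSums(A)
SB <- rowSums(B)

# row j of B is a superset of some row i of A iff C[i,j] == SA[i]
vapply(1:nrow(B), function(j) sum(C[, j] == SA & SA <= SB[j]) > 0, logical(1))
# TRUE TRUE TRUE FALSE, i.e. the expected 1 1 1 0
```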

HTH,
Eric


On Sat, Apr 21, 2018 at 10:27 AM, Neha Aggarwal 
wrote:

> Hi,
>
> I am looking for a way in which I can check if rows in 1 dataframe are
> present in another data frame in a unique way. A row in dataframe should be
> super set of any row in another dataframe.
>
> I can write a for loop for it, however, that will be inefficient. So, I am
> looking for an efficient way to do this in R.
>
> I have explained it with an example below:
>
> I want to check if a row in dataframe B is:
> 1) either equal to any row in A or
> 2) has 1's atleast for the columns where (any) row in B has 1's.
>
> My output/result is a vector of 1(TRUE) or 0(FALSE) of length equal to
> number of rows in B. The first row in B is exactly present in A so result
> has first bit as 1. Second row in B has matches with 2nd row of dataframe A
> (it has an extra 1 in 3rd column,which is ok);so second bit of result is
> also 1. Similarly, the 3rd row of B, can match to any row in A, so 3rd bit
> in result is also a 1. Next, 4th row in B has 1 for a column where no row
> in A has 1, so last bit in result is 0.
>
> Dataframe A
> 1 0 1 0
> 1 1 0 0
> 0 1 1 0
>
> Dataframe B
> 1 0 1 0
> 1 1 1 0
> 1 1 1 1
> 0 0 0 1
>
> Result<- 1 1 1 0
>
> Thanks for the help,
> Neha
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to dynamically add variables to a dataframe

2018-04-22 Thread Eric Berger
Hi Luca,
How about this?

# create some dummy data since I don't have your d0 or d1
> n  <- 3
> d0 <- data.frame(a=runif(5),b=runif(5))

# here's the suggested code
> d1 <- cbind(d0, matrix(0,nrow(d0),n))
> colnames(d1)[1:n + ncol(d0)] <- paste("V",1:n,sep="")
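
Put together with the dummy data, plus a loop-based fix for the code in the
question (d0$V[t] indexes element t of a single column named V, which is why
only one column V full of zeros appears):

```r
n  <- 3
d0 <- data.frame(a = runif(5), b = runif(5))

# vectorized: add all n columns at once
d1 <- cbind(d0, matrix(0, nrow(d0), n))
colnames(d1)[1:n + ncol(d0)] <- paste0("V", 1:n)  # V1, V2, V3

# loop-based alternative: build each column name explicitly
for (t in 1:n) d0[[paste0("V", t)]] <- 0
names(d0)  # "a" "b" "V1" "V2" "V3"
```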

HTH,
Eric


On Sun, Apr 22, 2018 at 11:13 AM, Luca Meyer  wrote:

> Hi,
>
> I am a bit rusty with R programming and do not seem to find a solution to
> add a number of variables to my existing dataframe. Basically I need to add
> n=dim(d1)[1] variables to my d0 dataframe and I would like them to be named
> V1, V2, V3, ... , V[dim(d1)[1])
>
> When running the following code:
>
> for (t in 1:dim(d1)[1]){
>   d0$V[t] <- 0
> }
>
> all I get is a V variable populated with zeros...
>
> I am sure there is a fairly straightforward code to accomplish what I need,
> any suggestion?
>
> Thank you,
>
> Luca
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rolling window difference for zoo time series

2018-04-24 Thread Eric Berger
lag(Zoo_TS)/Zoo_TS - 1

(note that in zoo, lag() with the default k = 1 shifts the series forward,
i.e. lag(x)[t] is x[t+1])
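
A sketch of both possible alignments, using a fixed start date instead of
Sys.time():

```r
library(zoo)

Zoo_TS <- zoo(5:1, as.Date("2018-04-24") + 0:4)

# zoo's lag() with the default k = 1 yields the *next* observation,
# i.e. lag(x)[t] == x[t+1]
r1 <- lag(Zoo_TS) / Zoo_TS - 1            # result carried on the earlier date
r2 <- diff(Zoo_TS) / lag(Zoo_TS, k = -1)  # same numbers on the later date

as.numeric(r1)  # -0.20 -0.25 -0.33... -0.50
```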


On Tue, Apr 24, 2018 at 9:29 PM, Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:

> Hi,
>
> I have a 'zoo' time series as below :
>
> Zoo_TS = zoo(5:1, as.Date(Sys.time())+0:4)
>
> Now I want to calculate First order difference of order 1, rolling
> window basis i.e.
>
> (Zoo_TS[2] - Zoo_TS[1] ) / Zoo_TS[1]
> (Zoo_TS[3] - Zoo_TS[2] ) / Zoo_TS[2]
> .
>
> Is there any direct function available to achieve this?
>
> Thanks,
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to define mutualy exclusive parameters of a function

2018-04-26 Thread Eric Berger
Hi Pol,
Here is one way:

fb <- function(mean=NULL, median=NULL, mode=NULL, a, b=0.95, lower=F) {

stopifnot ( (is.null(mean) + is.null(median) + is.null(mode)) == 2 )

etc...

}
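
A runnable sketch (the body after the check is a hypothetical placeholder,
not Pol's actual computation):

```r
fb <- function(mean = NULL, median = NULL, mode = NULL, a, b = 0.95, lower = FALSE) {
  # exactly one of mean/median/mode may be supplied
  stopifnot((is.null(mean) + is.null(median) + is.null(mode)) == 2)
  central <- c(mean, median, mode)  # collects the single non-NULL value
  central
}

fb(mean = 1, a = 2)                 # returns 1
try(fb(mean = 1, mode = 3, a = 2))  # fails the stopifnot() check
```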


HTH,
Eric


On Thu, Apr 26, 2018 at 4:46 PM, Polychronis Kostoulas <
polychronis.kostou...@gmail.com> wrote:

> Dear All,
> apologies if this is basic: I am writing a function:
>
> fb<-function(mean, median, mode, a, b=0.95, lower=F)
> {}
>
> The arguments mean, median and mode are mutually exclusive (i.e. the user
> should define only one of these). How do I code this within the function?
>
> Thanks,
> Pol
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding overall constraint in optim()

2018-05-06 Thread Eric Berger
Hi Michael,
A few comments
1. To add the constraint sum(wgt.vect) == 1 you would use the method of
Lagrange multipliers.
What this means is that in addition to the w_i (the components of the
weight vector) you would add an additional variable, call it lambda.
Then you would modify your opt.fun() function to add the term
 lambda * (sum(wgt.vect) - 1)
2. Are you sure that you have defined Mo.vect correctly? It is a scalar the
way you have written it.
3. Similarly your definition of wgt.vect creates a scalar.
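
Note that optim() minimizes, so adding a plain Lagrangian term turns the
problem into a saddle-point search, which L-BFGS-B will not solve directly.
A common practical workaround is a quadratic penalty. A self-contained toy
sketch (made-up numbers, not Michael's data):

```r
# toy objective of the same shape as opt.fun(): -return / variance
mo <- c(0.10, 0.20, 0.15)
cv <- diag(3)
obj <- function(w) -sum(mo * w) / drop(t(w) %*% cv %*% w)

# penalize deviations from sum(w) == 1
pen <- function(w, rho = 1e4) obj(w) + rho * (sum(w) - 1)^2

res <- optim(rep(1/3, 3), pen, method = "L-BFGS-B",
             lower = rep(0, 3), upper = rep(1, 3))
sum(res$par)  # close to 1; increase rho to tighten the constraint
```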

HTH,
Eric


On Fri, May 4, 2018 at 5:18 AM, Joshua Ulrich 
wrote:

> On Thu, May 3, 2018 at 2:03 PM, Michael Ashton
>  wrote:
> > Thanks Bert. But everyone on that forum wants to use finance tools
> rather than general optimization stuff! And I am not optimizing a
> traditional Markowitz mean-variance problem. Plus, smarter people here. :-)
> >
> I'm very confused by these statements.  Most of the "finance tools"
> use general-purpose global and/or stochastic optimization packages
> (e.g. rugarch uses nloptr and Rsolnp, PortfolioAnalytics uses DEoptim,
> pso, GenSA).  And most (all?) of those optimization packages have ways
> to specify box, equality, and nonlinear inequality constraints.
>
> And I can't recall the last time someone emailed the list about
> optimizing a traditional Markowitz mean-variance problem... maybe 10
> years ago?
>
> >> On May 3, 2018, at 3:01 PM, Bert Gunter  wrote:
> >>
> >> You can't -- at least as I  read the docs for ?optim (but I'm pretty
> >> ignorant about this, so maybe there's a way to tweak it so you can).
> >>
> >> See here:   https://cran.r-project.org/web/views/Optimization.html
> >> for other R optimization capabilities.
> >>
> >> Also,  given your credentials, the r-sig-finance list might be a
> >> better place for you to post your query.
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Thu, May 3, 2018 at 10:52 AM, Michael Ashton
> >>  wrote:
> >>> Hi –
> >>>
> >>> This is giving me a headache. I’m trying to do a relatively simple
> optimization – actually trying to approximate the output from the Excel
> Solver function but at roughly 1000x the speed. 😊
> >>>
> >>> The optimization parameters look like this. The only trouble is that I
> want to add a constraint that sum(wgt.vect)=1, and I can’t figure out how
> to do that in optim.
> >>>
> >>> Mo.vect <- as.vector(tail(head(mo,i),1))
> >>> wgt.vect <- as.vector(tail(head(moWeightsMax,i),1))
> >>> cov.mat <- cov(tail(head(morets,i+12),12))
> >>> opt.fun <- function(wgt.vect) -sum(Mo.vect %*% wgt.vect) /
> (t(wgt.vect) %*% (cov.mat %*% wgt.vect))
> >>>
> >>> LowerBounds<-c(0.2,0.05,0.1,0,0,0)
> >>> UpperBounds<-c(0.6,0.3,0.6,0.15,0.1,0.2)
> >>>
> >>> OptimSolution<-optim(wgt.vect, fn=opt.fun, method="L-BFGS-B",lower=
> LowerBounds,upper=UpperBounds)
> >>>
> >>>
> >>> Any thoughts are appreciated!
> >>>
> >>> Mike
> >>>
> >>> Michael Ashton, CFA
> >>> Managing Principal
> >>>
> >>> Enduring Investments LLC
> >>> W: 973.457.4602
> >>> C: 551.655.8006
> >>>
> >>>
> >>>[[alternative HTML version deleted]]
> >>>
> >>> __
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Joshua Ulrich  |  about.me/joshuaulrich
> FOSS Trading  |  www.fosstrading.com
> R/Finance 2018 | www.rinfinance.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding overall constraint in optim()

2018-05-06 Thread Eric Berger
typo: the term should read  lambda * (sum(wgt.vect) - 1)

On Sun, May 6, 2018 at 10:51 AM, Eric Berger  wrote:

> Hi Michael,
> A few comments
> 1. To add the constraint sum(wgt.vect) == 1 you would use the method of
> Lagrange multipliers.
> What this means is that in addition to the w_i (the components of the
> weight vector) you would add an additional variable, call it lambda.
> Then you would modify your opt.fun() function to add the term
>  lambda * (sum(wgt.vect) - 1)
> 2. Are you sure that you have defined Mo.vect correctly? It is a scalar
> the way you have written it.
> 3. Similarly your definition of wgt.vect creates a scalar.
>
> HTH,
> Eric
>
>
> On Fri, May 4, 2018 at 5:18 AM, Joshua Ulrich 
> wrote:
>
>> On Thu, May 3, 2018 at 2:03 PM, Michael Ashton
>>  wrote:
>> > Thanks Bert. But everyone on that forum wants to use finance tools
>> rather than general optimization stuff! And I am not optimizing a
>> traditional Markowitz mean-variance problem. Plus, smarter people here. :-)
>> >
>> I'm very confused by these statements.  Most of the "finance tools"
>> use general-purpose global and/or stochastic optimization packages
>> (e.g. rugarch uses nloptr and Rsolnp, PortfolioAnalytics uses DEoptim,
>> pso, GenSA).  And most (all?) of those optimization packages have ways
>> to specify box, equality, and nonlinear inequality constraints.
>>
>> And I can't recall the last time someone emailed the list about
>> optimizing a traditional Markowitz mean-variance problem... maybe 10
>> years ago?
>>
>> >> On May 3, 2018, at 3:01 PM, Bert Gunter 
>> wrote:
>> >>
>> >> You can't -- at least as I  read the docs for ?optim (but I'm pretty
>> >> ignorant about this, so maybe there's a way to tweak it so you can).
>> >>
>> >> See here:   https://cran.r-project.org/web/views/Optimization.html
>> >> for other R optimization capabilities.
>> >>
>> >> Also,  given your credentials, the r-sig-finance list might be a
>> >> better place for you to post your query.
>> >>
>> >> Cheers,
>> >> Bert
>> >>
>> >>
>> >> Bert Gunter
>> >>
>> >> "The trouble with having an open mind is that people keep coming along
>> >> and sticking things into it."
>> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> >>
>> >>
>> >> On Thu, May 3, 2018 at 10:52 AM, Michael Ashton
>> >>  wrote:
>> >>> Hi –
>> >>>
>> >>> This is giving me a headache. I’m trying to do a relatively simple
>> optimization – actually trying to approximate the output from the Excel
>> Solver function but at roughly 1000x the speed. 😊
>> >>>
>> >>> The optimization parameters look like this. The only trouble is that
>> I want to add a constraint that sum(wgt.vect)=1, and I can’t figure out how
>> to do that in optim.
>> >>>
>> >>> Mo.vect <- as.vector(tail(head(mo,i),1))
>> >>> wgt.vect <- as.vector(tail(head(moWeightsMax,i),1))
>> >>> cov.mat <- cov(tail(head(morets,i+12),12))
>> >>> opt.fun <- function(wgt.vect) -sum(Mo.vect %*% wgt.vect) /
>> (t(wgt.vect) %*% (cov.mat %*% wgt.vect))
>> >>>
>> >>> LowerBounds<-c(0.2,0.05,0.1,0,0,0)
>> >>> UpperBounds<-c(0.6,0.3,0.6,0.15,0.1,0.2)
>> >>>
>> >>> OptimSolution<-optim(wgt.vect, fn=opt.fun,
>> method="L-BFGS-B",lower=LowerBounds,upper=UpperBounds)
>> >>>
>> >>>
>> >>> Any thoughts are appreciated!
>> >>>
>> >>> Mike
>> >>>
>> >>> Michael Ashton, CFA
>> >>> Managing Principal
>> >>>
>> >>> Enduring Investments LLC
>> >>> W: 973.457.4602
>> >>> C: 551.655.8006
>> >>>
>> >>>
>> >>>[[alternative HTML version deleted]]
>> >>>
>> >>> __
>> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>

Re: [R] a question about R script : "Can only modify plain character vectors."

2018-05-08 Thread Eric Berger
Can you create a small script that reproduces the problem?
If you can, then please post it to the mailing list.


On Tue, May 8, 2018 at 4:24 PM, Bogdan Tanasa  wrote:

> Dear all,
>
> would appreciate a suggestion about the following situation : I am running
> a script in R, and shall i execute it in the terminal, step by step, it
> works fine.
>
> however, if  i do source ("script.R"), it does not complete and I am
> getting the error :
> "Can only modify plain character vectors."
>
> what may go wrong ? thank you for your help,
>
> -- bogdan
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding Year-Month-Day to X axis

2018-05-08 Thread Eric Berger
abline  (v=x_mmdd, lty=3, lwd=1.0, col="blue")


On Tue, May 8, 2018 at 5:23 PM, Gregory Coats  wrote:

> Since the horizontal axis side=1 is year-month-day, how do I issue an
> abline command to draw dashed vertical lines, as a background grid, within
> the graph’s border? Similar to the abline command I call below, in blue,
> for dashed horizontal lines, as a background grid.
> Greg
>
> y_duration <- c (301.59050, 387.35700, 365.64366, 317.26150, 321.71883,
> 342.44950, 318.95350, 322.33233, 330.60333, 428.99516, 297.82066,
> 258.23166, 282.01816)
> x_mmdd <-as.Date(c ("2018-04-25", "2018-04-26", "2018-04-27",
> "2018-04-28", "2018-04-29", "2018-04-30", "2018-05-01", "2018-05-02",
> "2018-05-03", "2018-05-04", "2018-05-05", "2018-05-06", "2018-05-07"),
> format="%Y-%m-%d")
> par (mar=c(6,4,4,2))
> plot(x_mmdd, y_duration, type="l", xaxt="n", yaxt="n",
> ylim=range(240,480), xlab="", ylab="", col="blue")
> abline  (h=c(240,270,300,330,360,390,420,450,480,510,540), lty=3,
> lwd=1.0, col="grey50")
> axis(side=2, at=240, cex.axis=1.0, label="4:00")
> axis(side=2, at=300, cex.axis=1.0, label="5:00")
> axis(side=2, at=360, cex.axis=1.0, label="6:00")
> axis(side=2, at=420, cex.axis=1.0, label="7:00")
> axis(side=2, at=480, cex.axis=1.0, label="8:00")
> axis(side=1, at=x_mmdd, labels=format(x_mmdd, "%Y-%m-%d"),
> las=2)
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with json data from the web into data frame in R

2018-05-08 Thread Eric Berger
Hi Rich,
Take a look at the function fromJSON found in the rjson package.
Note that the Usage in the help page: ?fromJSON
names the second argument 'file' but if you look at the description the
argument can be a URL.
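
A minimal sketch; the structure of the returned list depends entirely on the
API's payload, so the $query$results path below is an assumption to adapt:

```r
library(rjson)

url <- paste0("https://www.semantic-mediawiki.org/w/api.php",
              "?action=ask&query=[[Category:City]]&format=json")
# if the brackets are rejected, wrap the query part in URLencode()

dat <- fromJSON(file = url)  # 'file' accepts a URL
str(dat, max.level = 2)      # fromJSON() returns nested lists, not a data frame

# flattening depends on the payload; for a list of flat records, something like:
# df <- do.call(rbind, lapply(dat$query$results, as.data.frame))
```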

HTH,
Eric


On Tue, May 8, 2018 at 6:16 PM, Evans, Richard K. (GRC-H000) <
richard.k.ev...@nasa.gov> wrote:

> Hello
>
> I am able to construct a url that points to some data online in the JSON
> format.  See an example at [0].
>
> I would like to work with this data as a dataframe in R.
>
> I know that there is a package for handling json data [1] but it assumes
> the data is in a local file but It is not clear to me how to request the
> data from the web in an R script and get the json data converted into a
> data frame in R.
>
> Can anyone provide a basic example or some guidance please?
>
> -Rich (revansx)
>
> [0] https://www.semantic-mediawiki.org/w/api.php?
> action=ask&query=[[Category:City]]&format=json
> [1] https://www.tutorialspoint.com/r/r_json_files.htm
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Average of results coming from B=100 repetitions (looping)

2018-05-08 Thread Eric Berger
mean(unlist(lst))
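
Note that lst[[i]] <- MedAe is the more idiomatic form for storing a single
value (lst[i] <- MedAe happens to work here by coercion), and mean() needs
an atomic vector, hence the unlist(). A tiny sketch:

```r
lst <- list()
for (i in 1:3) lst[[i]] <- i * 1.5  # [[i]], not [i], to store each scalar
mean(unlist(lst))                   # 3
```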


On Tue, May 8, 2018 at 10:26 PM, varin sacha via R-help <
r-help@r-project.org> wrote:

>
>  Dear R-experts,
>
> Here below the reproducible example. I am trying to get the average of the
> 100 results coming from the "lst" function. I have tried lst$mean and
> mean(lst). It does not work.
> Any help would be highly appreciated.
>
> 
>
>  ## R script for getting MedAe and MedAeSQ from HBR model on Testing data
> install.packages("robustbase")
> install.packages( "MASS" )
> install.packages( "quantreg" )
> install.packages( "RobPer")
> install.packages("devtools")
> library("devtools")
> install_github("kloke/hbrfit")
> install.packages('http://www.stat.wmich.edu/mckean/Stat666/
> Pkgs/npsmReg2_0.1.1.tar.gz')
> library(robustbase)
> library(MASS)
> library(quantreg)
> library(RobPer)
> library(hbrfit)
>
> # numeric variables
> A=c(2,3,4,3,2,6,5,6,4,3,5,55,6,5,4,5,6,6,7,52)
> B=c(45,43,23,47,65,21,12,7,18,29,56,45,34,23,12,65,4,34,54,23)
> D=c(21,54,34,12,4,56,74,3,12,71,14,15,63,34,35,23,24,21,69,32)
>
> # Create a dataframe
> BIO<-data.frame(A,B,D)
>
> # Create a list to store the results
> lst<-list()
>
> # This statement does the repetitions (looping)
> for(i in 1 :100)
> {
>
> # randomize sampling seed
> n=dim(BIO)[1]
> p=0.667
>
> # Sample size
> sam=sample(1 :n,floor(p*n),replace=FALSE)
>
> # Sample training data
> Training =BIO [sam,]
>
> # Sample testing data
> Testing = BIO [-sam,]
>
> # Build the HBR model
> HBR<-hbrfit(D ~ A+B)
>
> # Grab the coefficients
> HBR_intercept <- as.numeric(HBR$coefficients[1])
> HBR_coefA <- as.numeric(HBR$coefficients[2])
> HBR_coefB <- as.numeric(HBR$coefficients[3])
>
> # Predict response on testing data
> Testing$pred <- HBR_intercept + HBR_coefA * Testing$A + HBR_coefB
> *Testing$B
>
> # Get errors
> Testing$sq_error <- (Testing$D-Testing$pred)^2
> Testing$abs_error <- abs(Testing$D-Testing$pred)
> MedAe <- median(Testing$abs_error)
> MedAe
> MedAeSQ <-median(Testing$sq_error)
> MedAeSQ
>
> lst[i]<-MedAe
> }
> lst
> mean(lst)
> lst$mean
>
> ##
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Seasonal weekly average

2018-05-09 Thread Eric Berger
Hi Shakeel,
One approach would be to look at the dplyr package and its functions
group_by() and summarise(). These should be useful in preparing the data.
(Alternatively if you know SQL you might look at dbplyr.)
On the plotting side you can use plot(...) for the first line and then
lines(...) for the second line.
Or you can go with the ggplot2 package for the charts but that might
require a bit more time to get up to speed.
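
A minimal sketch with made-up column names (assumes a data frame 'reports'
with one row per report and a 'date' column in dd/mm/yyyy form):

```r
library(dplyr)

reports <- data.frame(date = c("01/01/2018", "02/01/2018",
                               "03/01/2017", "09/01/2017"))

weekly_avg <- reports %>%
  mutate(date = as.Date(date, format = "%d/%m/%Y"),
         year = as.integer(format(date, "%Y")),
         week = as.integer(format(date, "%V"))) %>%  # ISO-8601 week number
  group_by(year, week) %>%
  summarise(n = n()) %>%    # reports per week within each year
  group_by(week) %>%
  summarise(avg = mean(n))  # average across years for each week of the year
```

Whether you want ISO weeks ("%V"), US weeks ("%U"), or epidemiological weeks
is worth deciding up front, since they disagree around the turn of the year.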

Good luck,
Eric


On Wed, May 9, 2018 at 9:37 AM, Shakeel Suleman 
wrote:

> Hi,
>
> I am fairly new to 'R' and would like advice on the following. I want to
> calculate a weekly average number of reports (e.g. of flu, norovirus) based
> on the same weeks for the last five years. I will then use this to plot a
> chart with 52 points for the average based on the last five years; another
> line will then plot the current year, enabling a comparison of current
> weekly counts against a five  year average for the same week. I would like
> some advice on how this can be done in 'R' . My data is disaggregated data
> - with dates in the format in 01/01/2018.
>
> Thanks
>
> Shakeel Suleman
>
>
>


Re: [R] strange behavior of plotmath

2018-05-21 Thread Eric Berger
FYI I see everything after the '^' as a superscript.
The '~' does act as a space. (When I omit it there is less space between
the '-' and the '('.)



On Mon, May 21, 2018 at 3:09 PM, Jinsong Zhao  wrote:

> hi there,
>
> I find the following codes produce strange output.
>
> plot(1:10, xlab = expression(NO[3]^-~(mg/L)))
>
> you will notice that the unit, mg/L, is also in superscript format.
> That suggests that "~" is not acting as a space here.
> However, the help page of plotmath does not mention this behavior.
>
> Best,
> Jinsong



Re: [R] how to make the code more efficient using lapply

2018-05-25 Thread Eric Berger
Hi Stephen,
I am not sure that the for loop itself is the source of the slowness.
You seem to be doing a lot of unnecessary work each time through the loop.
For example, there is no need to check whether you are on the last file;
just move that section outside the loop, where it will run once after the
loop finishes. You are also calling list.files() on every iteration, which
could be slow.

In any case here's a possible way to do it. Warning: untested!

f <- function(fn) {
  temp <- read_xlsx(fn, sheet = 1, range = cell_cols(c(1, 30, 38:42)))
  temp[temp$Id %in% c("geneA", "geneB", "geneC"), ]  # return the filtered rows
}
myL <- lapply(X = list.files(), FUN = f)
temp.df.all <- do.call("rbind", myL)
names(temp.df.all) <- gsub("^.*] ", "", names(temp.df.all))
write_xlsx(temp.df.all, path = "output.xlsx")

HTH,
Eric
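P.S. A self-contained toy illustration of the "scan the directory once, then
lapply + do.call(rbind, ...)" pattern, using csv files and read.csv as
stand-ins for the xlsx files and readxl calls (file names here are made up):

```r
# Build a throwaway directory with two small files to process
dir.create(tmp <- tempfile())
for (nm in c("a.csv", "b.csv")) {
  write.csv(data.frame(Id = c("geneA", "geneX"), val = 1:2),
            file.path(tmp, nm), row.names = FALSE)
}

files <- list.files(tmp, full.names = TRUE)    # one directory scan, reused

read_one <- function(fn) {
  d <- read.csv(fn, stringsAsFactors = FALSE)
  d[d$Id %in% c("geneA", "geneB", "geneC"), ]  # explicit return value
}

combined <- do.call("rbind", lapply(files, read_one))
combined   # one matching row per input file
```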




On Fri, May 25, 2018 at 9:24 AM, Stephen HonKit Wong 
wrote:

> Dear All,
>
> I have a following for-loop code which is basically intended to read in
> many excel files (each file has many columns and rows) in a directory and
> extract the some rows and columns out of each file and then combine them
> together into a dataframe. I use for loop which can do the work but quite
> slow. How to make it faster using lapply function ?  Thanks in advance!
>
>
>
> temp.df <- c()  # empty container to hold the result extracted from each
> # excel file inside the for-loop
>
>
> for (i in list.files()) {  # loop through each excel file in the directory
>
>   temp <- read_xlsx(i, sheet = 1, range = cell_cols(c(1, 30, 38:42)))
>   # read in the excel file with package "readxl"
>
>   temp <- temp[grep("^geneA$|^geneB$|^geneC$", temp$Id), ]  # extract rows
>   # based on temp$Id
>
>   names(temp) <- gsub("^.*] ", "", names(temp))  # clean up column names
>
>   temp.df <- append(temp.df, list(as.data.frame(temp)))  # wrap the data
>   # frame in a list so it can be appended
>
>   if (i == list.files()[length(list.files())]) {  # if it is the last excel
>     # file, combine all the rows in the list into one data frame (they all
>     # have the same column names)
>
>     temp.df.all <- do.call("rbind", temp.df)
>
>     write_xlsx(temp.df.all, path = "output.xlsx")  # write_xlsx from package
>     # writexl
>
>   }
>
> }
>
>
>
>
> *Stephen*
>


