[R] Documentation of Trigonometric Functions.

2021-04-14 Thread Jorgen Harmse via R-help
Is correct but incomplete documentation considered a bug? The documentation of 
trigonometric functions goes into detail about branch cuts for asin etc., but 
does not discuss the discontinuities of atan2. (It also fails to explain the 
difference between asin(2) (NaN) and asin(2+0i) (pi/2-acosh(2)i).) Since atan2 
accepts complex arguments, the discontinuities are not just on the half-axis 
with x<=0 and y=0.
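A short sketch of the behaviour described above, using only base R. The offset `eps` is an arbitrary value chosen for illustration, and the sketch only looks at real arguments to atan2.

```r
# asin() of a real argument outside [-1, 1] is NaN (with a warning),
# while the same value passed as a complex number gives a complex result.
real_result    <- suppressWarnings(asin(2))
complex_result <- asin(2 + 0i)

# atan2(y, x) jumps by about 2*pi as y crosses zero on the half-axis x < 0.
eps   <- 1e-12              # arbitrary small offset, for illustration only
above <- atan2(0, -1)       # pi
below <- atan2(-eps, -1)    # close to -pi
jump  <- above - below      # close to 2*pi
```

The real part of `complex_result` is pi/2 and the magnitude of its imaginary part is acosh(2).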

I also think that documentation of trigonometric functions should link to 
hyperbolic functions and that documentation of hyperbolic functions should link 
to the exponential function.

Jorgen Harmse.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assumptions about how things are done

2021-10-11 Thread Jorgen Harmse via R-help
As noted by Richard O'Keefe, what usually happens in an R function is that any 
argument is evaluated either in its entirety or not at all. A few functions use 
substitute or similar trickery, but then expectations should be documented. I 
can understand that you want something like ifelse(y>x,x/y,z) to run without 
warning about division by zero, but how would that be implemented in general? 
Even a subexpression as simple as f(a,b) presents a problem: you want 
f(a,b)[cond], but you don't know how the function f works. It might be just a 
vector operation (and then perhaps f(a[cond],b[cond]) is what we want), or it 
might return a+rev(b). Avi Gross correctly notes that the implementation is not 
what he wants, but I think that what he wants is possible only in special cases.
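To make the f(a,b)[cond] versus f(a[cond],b[cond]) point concrete, here is a minimal sketch. The functions `elementwise` and `reversing` are hypothetical stand-ins for an unknown f.

```r
a    <- 1:4
b    <- c(10, 20, 30, 40)
cond <- c(TRUE, FALSE, TRUE, FALSE)

elementwise <- function(a, b) a * b        # a plain vector operation
reversing   <- function(a, b) a + rev(b)   # depends on the whole vector

# Subsetting before or after the call agrees only for the elementwise case.
same_elementwise <- identical(elementwise(a, b)[cond],
                              elementwise(a[cond], b[cond]))
same_reversing   <- identical(reversing(a, b)[cond],
                              reversing(a[cond], b[cond]))
```

Without knowing which kind of function f is, a general ifelse could not safely push the condition into the arguments.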

Regards,
Jorgen Harmse. 



Message: 2
Date: Sat, 9 Oct 2021 15:35:55 -0400
From: "Avi Gross" 
To: 
Subject: [R] assumptions about how things are done
Message-ID: <029401d7bd44$e10843c0$a318cb40$@verizon.net>
Content-Type: text/plain; charset="utf-8"

This is supposed to be a forum for help so general and philosophical
discussions belong elsewhere, or nowhere.



Having said that, I want to make a brief point. Both new and experienced
people make implicit assumptions about the code they use. Often nobody looks
at how the sausage is made. The recent discussion of ifelse() made me take a
look and I was not thrilled.



My NAÏVE view was that ifelse() was implemented as a sort of loop construct.
I mean if I have a vector of length N and perhaps a few other vectors of the
same length, I might say:



result <- ifelse(condition-on-vector-A, result-if-true-using-vectors,
result-if-false-using-vectors)



So say I want to take a vector of integers from 1 to N and make an output a
second vector where you have either a prime number or NA. If I have a
function called is.prime() that checks a single number and returns
TRUE/FALSE, it might look like this:



primed <- ifelse(is.prime(A), A, NA)



So A[1] will be mapped to 1 and A[2] to 2 and A[3] to 3, but A[4] being
composite becomes NA and so on.



If you wrote the above using loops, it would be to range from index 1 to N
and apply the above. There are many complications as R allows vectors to be
longer or to be repeated as needed.



What I found ifelse() as implemented to do, is sort of like this:



Make a vector of the right length for the results, initially empty.



Make a vector evaluating the condition so it is effectively a Boolean
result.

Calculate which indices are TRUE. Secondarily, calculate another set of
indices that are false.



Calculate ALL the THEN conditions and ditto all the ELSE conditions.



Now copy into the result all the THEN values indexed by the TRUE above and
then all the ELSE values indicated by the FALSE above.



In plain English, make a result from two other results based on picking
either one from menu A or one from menu B.
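The steps above can be sketched roughly as follows. This is a simplified illustration, not the source of base::ifelse; the name `simple_ifelse` is hypothetical, and attribute handling and the special cases of the real function are left out.

```r
simple_ifelse <- function(test, yes, no) {
  test <- as.logical(test)     # condition as a plain logical vector
  ans  <- test                 # result starts as a copy of the condition
  len  <- length(ans)
  ypos <- which(test)          # indices where the condition is TRUE
  npos <- which(!test)         # indices where the condition is FALSE
  # yes and no have already been computed in full by the time they are used;
  # they are recycled to the length of test and then picked from by index.
  if (length(ypos) > 0L) ans[ypos] <- rep(yes, length.out = len)[ypos]
  if (length(npos) > 0L) ans[npos] <- rep(no,  length.out = len)[npos]
  ans
}

picked <- simple_ifelse(c(TRUE, FALSE, NA), 1:3, 7:9)
```

Positions where the condition is NA are filled from neither vector, so they stay NA.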



That is not a bad algorithm and in a vectorized language like R, maybe even
quite effective and efficient. It does lots of extra work as by definition
it throws at least half away.



I suspect the implementation could be made much faster by making some of it
done internally using a language like C.



But now that I know what this implementation did, I might have some qualms
at using it in some situations. The original complaint led to other
observations and needs and perhaps blindly using a supplied function like
ifelse() may not be a decent solution for some needs.



I note how I had to reorient my work elsewhere using a group of packages
called the tidyverse when they added a function to allow rowwise
manipulation of the data as compared to an ifelse-like method using all
columns at once. There is room for many approaches and if a function may not
be doing quite what you want, something else may better meet your needs OR
you may want to see if you can copy the existing function and modify it for
your own personal needs.



In the case we mentioned, the goal was to avoid printing selected warnings.
Since the function is readable, it can easily be modified in a copy to find
what is causing the warnings and either rewrite a bit to avoid them or start
over with perhaps your own function that tests before doing things and
avoids tripping the condition (generating a NaN) entirely.
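A minimal sketch of the "test before doing things" approach, assuming the warning comes from sqrt() of negative values (the example data are made up):

```r
x <- c(4, -1, 9)

# ifelse() evaluates sqrt(x) for every element, so the negative entry
# triggers a "NaNs produced" warning even though its result is discarded.
warned <- tryCatch({ ifelse(x >= 0, sqrt(x), NA); FALSE },
                   warning = function(w) TRUE)
via_ifelse <- suppressWarnings(ifelse(x >= 0, sqrt(x), NA))

# Guarded version: compute only where the condition holds.
ok          <- x >= 0
guarded     <- rep(NA_real_, length(x))
guarded[ok] <- sqrt(x[ok])
```

Both versions give the same values; only the guarded one avoids the NaN altogether.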



Like many languages, R is a bit too rich. You can piggyback on the work of
others but with some caution as they did not necessarily have you in mind
with what they created.







Re: [R] names.data.frame?

2021-11-04 Thread Jorgen Harmse via R-help
Can someone please explain what Leonard Mada is trying to do? As far as I know, 
names is not generic and there is no names.data.frame because it’s not needed. 
(A data.frame seems to be just a named list with some extra functionality that 
depends on every element being a vector with the same length and some 
overloading of list functions to ensure that that is always true.) The other 
answers confused me more.
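A quick check of the "named list" view, using a small made-up data frame:

```r
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

is_list      <- is.list(df)    # a data frame is a list
storage      <- typeof(df)     # the underlying type
col_names    <- names(df)      # the names of the list elements
same_as_list <- identical(names(df), names(unclass(df)))
col_lengths  <- lengths(df)    # every column has the same length
```

Here names() on the data frame returns the same thing as names() on the underlying list.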

Regards,
Jorgen Harmse.




Re: [R] Bug in list.files(full.names=T)

2021-12-20 Thread Jorgen Harmse via R-help
There is a possibly related problem in file.path. As Murdoch says, it's ugly 
even if the OS accepts it, and I don't see that the base version is any better 
than paste(sep=fsep, ...). Pasting the result into emacs wouldn't work. I wrote 
my own version to remove trailing fsep from all but the last argument.

> base::file.path('foo/','bar')
[1] "foo//bar"
> file.path('foo/','bar')
[1] "foo/bar"
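The wrapper itself was not posted, so the following is only a guess at what such a function might look like. The names `strip_trailing_sep` and `file_path2` are hypothetical, and the sketch assumes all arguments are character vectors.

```r
# Remove trailing separators from each element, but never reduce an
# element that consists only of the separator (e.g. "/") to "".
strip_trailing_sep <- function(x, fsep) {
  repeat {
    hit <- endsWith(x, fsep) & nchar(x) > nchar(fsep)
    if (!any(hit)) break
    x[hit] <- substr(x[hit], 1L, nchar(x[hit]) - nchar(fsep))
  }
  x
}

# Like base::file.path, but strips trailing fsep from all but the last argument.
file_path2 <- function(..., fsep = .Platform$file.sep) {
  parts <- list(...)
  n <- length(parts)
  if (n > 1L) {
    parts[-n] <- lapply(parts[-n], strip_trailing_sep, fsep = fsep)
  }
  do.call(file.path, c(parts, list(fsep = fsep)))
}

joined <- file_path2("foo/", "bar", fsep = "/")
```

Leaving the last argument untouched means a deliberate trailing separator on the final component survives.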

Incidentally, I don't like T & F for Booleans (or t for transpose) in 
production code. Single letters are too useful for local variables, so I would 
say TRUE, FALSE, & base::t.

Jorgen Harmse.



Message: 1
Date: Sat, 18 Dec 2021 15:55:37 +0100
From: Mario Reutter 
To: r-help@r-project.org
Subject: [R] Bug in list.files(full.names=T)
Message-ID:

Content-Type: text/plain; charset="utf-8"

Dear everybody,

I'm a researcher in the field of psychology and a passionate R user. After
having updated to the newest version, I experienced a problem with
list.files() if the parameter full.names is set to TRUE.
A path separator "/" is now always appended to path in the output even if
path %>% endsWith("/"). This breaks backwards compatibility in case path
ends with a path separator. The problem occurred somewhere between R
version 3.6.1 (2019-07-05) and 4.1.2 (2021-11-01).

Example:
>> list.files("C:/Data/", full.names=T)
C:/Data//file.csv

Expected behavior:
Either a path separator should never be appended in accordance with
the documentation: "full.names
a logical value. If TRUE, the directory path is prepended to the file names
to give a relative file path."
Or it could only be appended if path doesn't already end with a path
separator.

My question would now be if this warrants a bug report? And if you agree,
could someone issue the report since I'm not a member on Bugzilla?

Thank you and best regards,
Mario Reutter





--



Message: 3
Date: Sun, 19 Dec 2021 07:24:06 -0500
From: Duncan Murdoch 
To: Mario Reutter , r-help@r-project.org
Subject: Re: [R] Bug in list.files(full.names=T)
Message-ID: <67096ee7-054d-0e89-cc44-6ca702307...@gmail.com>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

I don't know the answer to your question, but I see the same behaviour 
on MacOS, e.g. list.files("./") includes ".//R" in the results on my 
system.  Both "./R" and ".//R" are legal ways to express that path on 
MacOS, so it's not a serious bug, but it does look ugly.

Duncan Murdoch

...


--

Message: 10
Date: Mon, 20 Dec 2021 09:46:23 +0100
From: Martin Maechler 
To: Mario Reutter 
Cc: 
Subject: Re: [R] Bug in list.files(full.names=T)
Message-ID: <25024.17119.584361.442...@stat.math.ethz.ch>
Content-Type: text/plain; charset="us-ascii"

...

This expected behavior has never been documented AFAIK.
I tried R 3.6.1 on Linux  and it already behaved like that:  If
you append a path separator  it is kept in addition to the new
one even though it's not needed:

> list.files("/tmp", "^[abc]", full.names = TRUE)
[1] "/tmp/check_proc__localuser"

> list.files("/tmp/", "^[abc]", full.names = TRUE)
[1] "/tmp//check_proc__localuser"

Why would one ever *add* a final unneeded path separator, unless
one wanted it?

Note that the default is  ".",  not "./"  ..

I think the change you see made R-for-Windows compatible
to the rest of the univeRse where list.files() aka dir() always
behaved like this.
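The behaviour can be reproduced without touching real data by using a temporary directory. This is a sketch; the directory and file names are made up, and the doubled separator is what the Linux output quoted above shows.

```r
d <- tempfile("listfiles-demo")          # hypothetical scratch directory
dir.create(d)
file.create(file.path(d, "a.txt"))

without_sep <- list.files(d, full.names = TRUE)
with_sep    <- list.files(paste0(d, "/"), full.names = TRUE)

# Both strings name the same file once the path is normalised.
same_file <- normalizePath(without_sep) == normalizePath(with_sep)

unlink(d, recursive = TRUE)
```

If a clean string matters, the trailing separator can be dropped before the call, or the result passed through normalizePath().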

I agree that ideally this would have been mentioned in some of
the NEWS; maybe it *is* mentioned in the rw-FAQ (R for Windows
FAQ) or other R for Windows news.. ?


> My question would now be if this warrants a bug report?

I don't think so.
As I'm saying above, I think this has rather been a bug fix,
making R more universal / less platform-dependent.

Last but not least: You'd ideally update more than every 2.5 years...

Best,
Martin Maechler






Re: [R] How important is set.seed

2022-03-22 Thread Jorgen Harmse via R-help
Jeff Newmiller makes an interesting point about distributed processing, but I 
don't know how to use the usual pseudo-random processes to obtain deterministic 
results when I don't know how the data will be sharded. You might have to 
replace pseudo-random sampling with deterministic sampling using a hash of 
something involving the unique key. Then the selection of a salt is the 
equivalent of a call to set.seed in non-parallel processing. The results should 
be the same as long as you fix the data set & the salt, and then you can test 
sensitivity to changes in the salt.
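A rough sketch of the salted-hash idea in base R. The function names are hypothetical, and `toy_hash_unit` is a deliberately simple polynomial hash used only so the example is self-contained; a real application would want a proper hash function.

```r
# Map (salt, key) to a number in [0, 1) using a toy polynomial hash.
toy_hash_unit <- function(key, salt) {
  codes <- utf8ToInt(paste0(salt, ":", key))
  h <- 0
  for (code in codes) {
    h <- (h * 31 + code) %% 2147483647
  }
  h / 2147483647
}

# Keep a key when its hash falls below the requested fraction.
select_keys <- function(keys, salt, fraction = 0.5) {
  u <- vapply(keys, toy_hash_unit, numeric(1), salt = salt, USE.NAMES = FALSE)
  keys[u < fraction]
}

keys    <- sprintf("id%03d", 1:100)     # made-up unique keys
whole   <- select_keys(keys, salt = "salt-A")
sharded <- c(select_keys(keys[1:30],   salt = "salt-A"),
             select_keys(keys[31:100], salt = "salt-A"))
```

Because each decision depends only on the key and the salt, the selection is the same however the data are split, and rerunning with a different salt gives a way to test sensitivity.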
Jorgen Harmse


From: Neha gupta 
To: "Ebert,Timothy Aaron" 
Cc: Jeff Newmiller , "r-help@r-project.org"

Subject: Re: [R] How important is set.seed
Message-ID:

Content-Type: text/plain; charset="utf-8"

Thank you all.

Actually I need set.seed because I have to evaluate the consistency of
features selection generated by different models, so I think for this, it's
recommended to use the seed.

Warm regards

On Tuesday, March 22, 2022, Ebert,Timothy Aaron  wrote:

> If you are using the program for data analysis then set.seed() is not
> necessary unless you are developing a reproducible example. In a standard
> analysis it is mostly counter-productive because one should then ask if
> your presented results are an artifact of a specific seed that you selected
> to get a particular result. However, in cases where you need a reproducible
> example, debugging a program, or specific other cases where you might need
> the same result with every run of the program then set.seed() is an
> essential tool.
> Tim
>
> -Original Message-
> From: R-help  On Behalf Of Jeff Newmiller
> Sent: Monday, March 21, 2022 8:41 PM
> To: r-help@r-project.org; Neha gupta ; r-help
> mailing list 
> Subject: Re: [R] How important is set.seed
>
> [External Email]
>
> First off, "ML models" do not all use random numbers (for prediction I
> would guess very few of them do). Learn and pay attention to what the
> functions you are using do.
>
> Second, if you use random numbers properly and understand the precision
> that your specific use case offers, then you don't need to use set.seed.
> However, in practice, using set.seed can allow you to temporarily avoid
> chasing precision gremlins, or set up specific test cases for testing code,
> not results. It is your responsibility to not let this become a crutch... a
> randomized simulation that is actually sensitive to the seed is unlikely to
> offer an accurate result.
>
> Where to put set.seed depends a lot on how you are performing your
> simulations. In general each process should set it once uniquely at the
> beginning, and if you use parallel processing then use the features of your
> parallel processing framework to insure that this happens. Beware of
> setting all worker processes to use the same seed.
>
> On March 21, 2022 5:03:30 PM PDT, Neha gupta 
> wrote:
> >Hello everyone
> >
> >I want to know
> >
> >(1) In which cases, we need to use set.seed while building ML models?
> >
> >(2) Which is the exact location we need to put the set.seed function i.e.
> >when we split data into train/test sets, or just before we train a model?
> >
> >Thank you

Re: [R] Question about Line Ending Choice

2022-09-28 Thread Jorgen Harmse via R-help
eol seems to be the parameter to use, but the answers so far appear to assume 
that the file is created on a Mac. For example, I think that "\r\n" on Windows 
would produce CR CR LF. I don't have both systems handy (so I can't test), but 
I think you should use raw to specify the bytes you want.

# I think the following are independent of the OS on which you are writing the
# file.
CR <- rawToChar(as.raw(13))
LF <- rawToChar(as.raw(10))
if (missing(target)) {
  # Hope that it matches the machine on which you are writing the file.
  eol <- "\n"
} else if (target == "Windows") {
  eol <- paste0(CR, LF)  # one string; c(CR, LF) would be a length-2 vector
} else if (target %in% c("Unix", "Mac")) {
  eol <- LF
} else {
  # ... (other targets elided in the original post)
  stop("Unexpected target.")
}

write.table(eol = eol, ...)
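One way to take the operating system out of the picture is to write through a connection opened in binary mode, which the write.table help page suggests for writing Unix-style files on Windows. The sketch below uses a made-up data frame and a temporary file, and then reads the bytes back to see what was written.

```r
df   <- data.frame(x = 1:2)               # made-up data
path <- tempfile(fileext = ".txt")

con <- file(path, open = "wb")            # binary mode: no newline translation
write.table(df, file = con, eol = "\r\n", row.names = FALSE, quote = FALSE)
close(con)

# Inspect the raw bytes: each line should end in 0d 0a (CR LF).
bytes <- readBin(path, what = "raw", n = file.size(path))
unlink(path)
```

Checking the bytes this way also makes it possible to test the output on whichever system is at hand.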

Regards,
Jorgen Harmse.



Message: 7
Date: Tue, 27 Sep 2022 11:35:54 -0400
From: "Stephen H. Dawson, DSL" 
To: Bert Gunter 
Cc: r-help 
Subject: Re: [R] Question about Line Ending Choice
Message-ID: <04e458aa-e5f5-c932-da3c-1aa35db7d...@shdawson.com>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Hi Bert,


Thanks for the reply.

I did see the parameter, but was not sure if this is the correct
parameter to reference. I also see it in write.csv.

I take it you are saying the eol parameter is the best practice for
exporting from R using these functions. Am I correct or is there another
option other than write.csv and write.table I should be considering?


Thanks,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com


On 9/27/22 11:29, Bert Gunter wrote:
> Did you not see the "eol" parameter in write.table ?
>
> Bert
>
> On Tue, Sep 27, 2022 at 8:23 AM Stephen H. Dawson, DSL via R-help
>  wrote:
>
> Hi All,
>
>
> I am writing with a question about choosing the line ending aspect
> of a
> file, please.
>
> I use write.csv and write.table to export work to CSV files and TXT
> files. I am planning now on how to share my work with the Windows
> crowd
> beyond only sharing with the Linux crowd. I use my text editor to
> flip
> the line ending option from Linux to Windows after exporting. This is
> inefficient for me to accomplish if I ramp up production as I expect
> will occur.
>
> Staying with the character encoding of UTF-8 seems fine for now from
> what I understand I need to deliver to my customers.
>
> What seems more efficient to me is to learn how to use R to define
> the
> line ending aspect of the exported file. I have not found if this
> is an
> option within R.
>
> QUESTION
> Is it possible within R to define the line ending aspect of file
> output?
>
>
> Kindest Regards,
> --
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com
>





Re: [R] unexpected 'else' in " else"

2022-10-21 Thread Jorgen Harmse via R-help
Andrew Simmons is correct but doesn't explain why the code works in the 
package. This is one of only two differences I have found between running code 
at the command line and running it from a file. (The other difference is that 
code in a file is often executed in an environment other than .GlobalEnv. There 
is some related sugar around packages that usually makes things work the way a 
user would want.) At the command line, R executes code whenever RETURN could be 
interpreted as the end of a statement. "if (...) ... RETURN" is ambiguous: will it 
be followed by "else", or is it a complete statement? If it's in a file or 
wrapped in a block or other structure that obviously hasn't ended yet then R 
will wait to see the next line of input, but if it could be a complete 
statement then not executing it would cause a lot of frustration for users. 
Once the statement is executed, R expects another statement, and no statement 
begins with "else". (Maybe the interpreter could be enhanced to keep the "if" 
open under some conditions, but I haven't thought it through. In particular, 
"if" without "else" is NULL if the condition is FALSE, so it might be necessary 
to undo an assignment, and that seems very difficult.)
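The same rule can be observed without typing at the prompt by handing text to parse(). This is a small sketch; the strings are made up for illustration.

```r
top_level <- "if (FALSE) 'yes'\nelse 'no'"          # else starts a new line
braced    <- "{\nif (FALSE) 'yes'\nelse 'no'\n}"    # same code inside braces

# At top level the first line is already a complete statement,
# so the parser rejects the 'else' that follows.
top_level_fails <- inherits(try(parse(text = top_level), silent = TRUE),
                            "try-error")

# Inside braces the statement is still open, so 'else' is accepted.
braced_value <- eval(parse(text = braced))

# An 'if' without 'else' evaluates to NULL when the condition is FALSE.
no_else_value <- if (FALSE) 'yes'
```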

Regards,
Jorgen Harmse.


On Fri., Oct. 21, 2022, 05:29 Jinsong Zhao,  wrote:

> Hi there,
>
> The following code would cause R error:
>
>  > w <- 1:5
>  > r <- 1:5
>  > if (is.matrix(r))
> + r[w != 0, , drop = FALSE]
>  > else r[w != 0]
> Error: unexpected 'else' in "else"
>
> However, the code:
>  if (is.matrix(r))
>  r[w != 0, , drop = FALSE]
>  else r[w != 0]
> is extracted from stats::weighted.residuals.
>
> My question is why the code in the function does not cause error?
>
> Best,
> Jinsong


Re: [R] [EXTERNAL] Re: unexpected 'else' in " else"

2022-10-21 Thread Jorgen Harmse via R-help
Thank you. I knew it had nothing to do with the choice of environment, but I 
thought I had seen such unwrapped code working in files in a previous version. 
Maybe I misremembered. Incidentally, there is nothing special about braces: 
anything that makes the statement incomplete will do.

Regards,
Jorgen.


> evalq( if (FALSE)
+   cat("This shouldn't happen.\n")
+ else
+   cat("Everything is fine.\n"),
+ .GlobalEnv
+ )
Everything is fine.
> x <- 1:5
> x[ if(FALSE) 1L
+    else 2L
+  ]
[1] 2


From: Andrew Simmons 
Date: Friday, 21 October, 2022 at 11:20
To: Jorgen Harmse 
Cc: r-help@r-project.org 
Subject: [EXTERNAL] Re: [R] unexpected 'else' in " else"
The code working inside stats::weighted.residuals has nothing to do
with being evaluated in a different environment than globalenv() and
has nothing to do with being inside a package.
The reason the code works inside stats::weighted.residuals is because
the function body is wrapped with braces. You can try it yourself:

local({
    FILE <- tempfile(fileext = ".R")
    on.exit(unlink(FILE, force = TRUE, expand = FALSE), add = TRUE,
            after = FALSE)
    writeLines("if (TRUE) \n'evaluating cons.expr'\nelse 'evaluating alt.expr'",
               FILE)
    writeLines(readLines(FILE))
    try(source(FILE, local = TRUE, echo = TRUE, verbose = FALSE))
})

If you try entering it as a function, it still fails:

local({
    FILE <- tempfile(fileext = ".R")
    on.exit(unlink(FILE, force = TRUE, expand = FALSE), add = TRUE,
            after = FALSE)
    writeLines("function () \nif (TRUE) \n'evaluating cons.expr'\nelse 'evaluating alt.expr'",
               FILE)
    writeLines(readLines(FILE))
    try(source(FILE, local = TRUE, echo = TRUE, verbose = FALSE))
})

But R packages use sys.source() instead of source() to run R code, but
it still fails if you run it:

local({
    FILE <- tempfile(fileext = ".R")
    on.exit(unlink(FILE, force = TRUE, expand = FALSE), add = TRUE,
            after = FALSE)
    writeLines("if (TRUE) \n'evaluating cons.expr'\nelse 'evaluating alt.expr'",
               FILE)
    writeLines(readLines(FILE))
    try(sys.source(FILE, envir = environment()))
})

The part that matters is that the function body is wrapped with
braces. `if` statements inside braces or parentheses (or possibly
brackets) will continue looking for `else` even after `cons.expr` and
a newline has been fully parsed, but will not otherwise.


Re: [R] unexpected 'else' in " else" (Ebert,Timothy Aaron)

2022-10-24 Thread Jorgen Harmse via R-help
There were several interesting points about `ifelse`. The usual behaviour seems 
to be that all three inputs are evaluated, and the entries of `yes` 
corresponding to `TRUE` in `test` are combined with the entries of `no` 
corresponding to `FALSE` in `test`. Moreover, `yes` & `no` seem to be recycled 
as necessary in case `test` is longer. On top of that, there seems to be some 
sugar that suppresses evaluations in case `all(test)` and/or `all(!test)`, and 
the return type can be `logical` even if `yes` & `no` are not. I agreed with 
the other responses already, but my experiments further confirmed that `ifelse` 
is not interchangeable with `if (test) yes else no`.



The documentation confirms most of this, but 'same length and attributes 
(including dimensions and "class") as test' looks wrong. The output seems 
to be `logical` or something related to the classes of `yes` & `no`.
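A commonly encountered illustration of the class question, sketched with a made-up date: the result of ifelse takes its attributes from `test`, so a class carried by `yes` or `no` does not survive.

```r
d <- as.Date("2022-10-24")       # made-up value with a "Date" class

r <- ifelse(TRUE, d, d)          # test is a plain logical with no attributes

class_in  <- class(d)            # class of the inputs
class_out <- class(r)            # class of the result
```

The result holds the underlying day count as a bare number rather than a Date.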



Regards,

Jorgen Harmse.



> ifelse(FALSE, {cat("Evaluating the vector for 'if'.\n"); 1:3},
+        {cat("Evaluating the vector for 'else'.\n"); 0:4})
Evaluating the vector for 'else'.
[1] 0
> ifelse(rep(FALSE,5L), {cat("Evaluating the vector for 'if'.\n"); 1:3},
+        {cat("Evaluating the vector for 'else'.\n"); 0:4})
Evaluating the vector for 'else'.
[1] 0 1 2 3 4
> ifelse(rep(TRUE,3L), {cat("Evaluating the vector for 'if'.\n"); 1:3},
+        {cat("Evaluating the vector for 'else'.\n"); 0:4})
Evaluating the vector for 'if'.
[1] 1 2 3
> ifelse(c(TRUE,TRUE,FALSE), {cat("Evaluating the vector for 'if'.\n"); 1:3},
+        {cat("Evaluating the vector for 'else'.\n"); 0:4})
Evaluating the vector for 'if'.
Evaluating the vector for 'else'.
[1] 1 2 2
> ifelse(c(TRUE,TRUE,FALSE,TRUE), {cat("Evaluating the vector for 'if'.\n");
+        1:3}, {cat("Evaluating the vector for 'else'.\n"); 0:4})
Evaluating the vector for 'if'.
Evaluating the vector for 'else'.
[1] 1 2 2 1
> args(ifelse)
function (test, yes, no)
NULL
> ifelse(c(TRUE,TRUE,FALSE,TRUE,TRUE,FALSE,TRUE),
+        {cat("Evaluating the vector for 'if'.\n"); 1:3},
+        {cat("Evaluating the vector for 'else'.\n"); 0:4})
Evaluating the vector for 'if'.
Evaluating the vector for 'else'.
[1] 1 2 2 1 2 0 1
> ifelse(logical(0L), {cat("Evaluating the vector for 'if'.\n"); 1:3},
+        {cat("Evaluating the vector for 'else'.\n"); 0:4})
logical(0)
> ifelse(TRUE, integer(0L), numeric(0L))
[1] NA
> class(ifelse(TRUE, integer(0L), numeric(0L)))
[1] "integer"
> ifelse(integer(0L)) # test is an empty vector of integers and yes & no are missing.
logical(0)










Re: [R] [EXTERNAL] Re: unexpected 'else' in " else" (Ebert, Timothy Aaron)

2022-10-24 Thread Jorgen Harmse via R-help
I agree that the documentation should be clarified. Moreover, my last example 
shows that the class can be different even when no mode coercion is required. I 
don't know enough about S3 & S4 to comment on your last point.

Regards,
Jorgen Harmse.


From: Bert Gunter 
Date: Monday, 24 October, 2022 at 11:31
To: Jorgen Harmse 
Cc: r-help@r-project.org 
Subject: [EXTERNAL] Re: [R] unexpected 'else' in " else" (Ebert,Timothy Aaron)
...

So it would appear that the ifelse() documentation needs to be
clarified. For example, if the above asterisked phrase were "The S3
*class* of the answer will be inferred from the mode, where the mode
of the answer will be coerced ..." that might resolve at least that
bit of confusion However, that might also be incorrect -- what about
S4 vs S3 vs Reference classes, for example (are such cases even
possible?)? I leave resolution of these matters -- or at least their
accurate and complete documentation -- to wiser heads.

Cheers,
Bert

...

> > ifelse(integer(0L)) # test is an empty vector of integers and yes & no are 
> > missing.
>
> logical(0)




Re: [R] [EXTERNAL] Re: unexpected 'else' in " else" (Ebert, Timothy Aaron)

2022-10-25 Thread Jorgen Harmse via R-help
Good catch! I also misread it, and I think most people would. If I wanted to 
write confusing documentation then I could play similar games with 'mode' and 
'length'.

Regards,
Jorgen Harmse.



> test <- c(TRUE,FALSE,FALSE)
> attr(test,'class') <- 'foo' # probably a bad idea, but I want to see what will happen
> z <- ifelse(test, 1:3, 7:9)
> attr(z,'class')
[1] "foo"
> attr(z,'mode')
NULL
> attr(z,'length')
NULL




From: Bert Gunter 
Date: Monday, 24 October, 2022 at 12:07
To: Jorgen Harmse 
Cc: r-help@r-project.org 
Subject: [EXTERNAL] Re: [R] unexpected 'else' in " else" (Ebert,Timothy Aaron)
I wanted to follow up.
A more careful reading of the following:
"A vector of the same length and attributes (including dimensions and
"class") as test..."

So the above **refers only to a "class" attribute that appears among
the attributes of test and result**. Using my previous example, note
that:

> z <- c(TRUE,TRUE,FALSE)
> attributes(z)
NULL ## so no 'class' among attributes(z)
## However
> class(z)  ## S3 class
[1] "logical"
## Similarly
> w <- ifelse(z,5,'a')
> attributes(w)
NULL ## so no 'class' among attributes(w)
> class(w)   ##S3 class
[1] "character"

So my (anyway) confusion stems from conflating the S3 'class' of the
object with a "class" attribute, of which there is none.
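The distinction between the implicit class and a "class" attribute can be checked directly. In this sketch the class name "foo" is hypothetical, as in the earlier example in this thread.

```r
z <- c(TRUE, TRUE, FALSE)

implicit <- class(z)            # implicit class reported by class()
explicit <- attr(z, "class")    # no "class" attribute is set
old      <- oldClass(z)         # oldClass() looks only at the attribute

# When test does carry a "class" attribute, ifelse() copies it to the result.
tagged <- structure(z, class = "foo")
kept   <- attr(ifelse(tagged, 1:3, 7:9), "class")
```

So the documented "same attributes as test" refers to what attr() or oldClass() would show, not to what class() reports for a plain vector.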

Nevertheless, I believe that the phrase I suggested (or something
along those lines) might clarify how the S3 class is determined and
perhaps better distinguish it from a "class" attribute among the
attributes, if there is such. Or maybe that part of the doc
should just be removed.

My guess is that this documentation has been around for a long time
and no one has gotten around to revising it once S3 classes came into
wider use. ... or saw the need to revise it, anyway.

-- Bert






Re: [R] unexpected 'else' in " else"

2022-10-28 Thread Jorgen Harmse via R-help
Richard O'Keefe's remarks on the workings of the interpreter are correct, but 
the code examples are ugly and hard to read. (On the other hand, anyone who has 
used the debugger may be de-sensitised to horrible code formatting.) The use of 
whitespace should if possible reflect the structure of the code, and I would 
usually rather throw in a few extra delimiters than obscure the structure.

Regards,
Jorgen Harmse.


Examples (best viewed in a real text editor so things line up):


{ if (x
To: Jinsong Zhao 
Cc: "r-help@r-project.org" 
Subject: Re: [R] unexpected 'else' in " else"
Message-ID:

Content-Type: text/plain; charset="utf-8"

...

The basic issue is that the top level wants to get started
on your command AS SOON AS IT HAS A COMPLETE COMMAND,
and if (...) stmt
is complete.  It's not going to hang around "Waiting for Godot"
for an 'else' that might never ever ever turn up.  So
   if (x < y) z <-
   x else z <- y
is absolutely fine, no braces needed, while
   if (x < y) z <- x
   else z <- y
will see the eager top level rush off to do your bidding
at the end of the first line and then be completely
baffled by an 'else' where it does not expect one.

It's the same reason that you break AFTER infix operators
instead of BEFORE.
   x <- y +
   z
works fine, while
   x <- y
   + z
doesn't.
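
One way to see this without typing at the console is parse(text = ...), which reads text one complete expression at a time, much as the top level does. This is just a sketch; the helper name parses_ok is made up for the illustration.

```r
# TRUE if the text parses, FALSE if the parser reports a syntax error.
parses_ok <- function(txt) {
  tryCatch({ parse(text = txt); TRUE }, error = function(e) FALSE)
}

ok1 <- parses_ok("if (x < y) z <-\n x else z <- y")       # TRUE: 'else' follows on the same line as the statement's end
ok2 <- parses_ok("if (x < y) z <- x\nelse z <- y")        # FALSE: unexpected 'else'
ok3 <- parses_ok("{ if (x < y) z <- x\n  else z <- y }")  # TRUE: inside braces the parser keeps reading

# The infix case does not fail to parse; it silently becomes two expressions.
n_after  <- length(parse(text = "x <- y +\nz"))   # 1
n_before <- length(parse(text = "x <- y\n+ z"))   # 2
```

So "doesn't work" for the infix example means that x is assigned y and the second line is evaluated (and discarded) separately.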







Re: [R] [EXTERNAL] RE: unexpected 'else' in " else"

2022-10-28 Thread Jorgen Harmse via R-help
Thank you for the comment. I don't like the vertical sprawl either, but I 
assumed for the examples that the variable names were supposed to be very long 
or were stand-ins for complicated sub-expressions. Sometimes it's better to put 
sub-expression results into temporary variables (and possibly use delayedAssign 
to preserve short-circuit behaviour), but I was discussing my line-break 
preferences in case breaks are needed.

Regards,
Jorgen Harmse.


From: Ebert,Timothy Aaron 
Date: Friday, 28 October, 2022 at 10:22
To: Jorgen Harmse , r-help@r-project.org 

Subject: [EXTERNAL] RE: unexpected 'else' in " else"
I appreciate this thread on coding. My preference for reading is to have 
complete sentences.
I can read this:
{ if (x On Behalf Of Jorgen Harmse via 
R-help
Sent: Friday, October 28, 2022 10:39 AM
To: r-help@r-project.org
Subject: Re: [R] unexpected 'else' in " else"



Re: [R] Get data from a list of data frames (Stefano Sofia)

2022-12-16 Thread Jorgen Harmse via R-help
Following Bert Gunter's suggestion, I wonder why the data are in separate 
frames (with hard-coded values) in the first place. You could put them in a 
text file and call read.table. If you provide a header and put a meaningful 
station name at the start of each data row then rownames of your data frame 
will be meaningful.
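
A sketch of what I mean, with the file contents passed through the text argument so the example is self-contained. The column layout (one column per sensor) is my assumption, not something from the original post. Because the header has one fewer field than the data rows, read.table uses the first column as row names.

```r
# Hypothetical file contents: header names the sensors, each row starts with a station name.
lines <- c("thermometer raingauge snowgauge anemometer",
           "Station1    2583      1478      3178      NA",
           "Station2    2584      1479      3179      4453")

codes <- read.table(text = lines)
rownames(codes)                                   # "Station1" "Station2"
codes[c("Station1", "Station2"), "thermometer"]   # 2583 2584
```

With a real file you would call read.table("stations.txt") instead, where the file name is again only a placeholder.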

Regards,
Jorgen Harmse.

From: R-help  on behalf of 
r-help-requ...@r-project.org 
Date: Friday, 16 December, 2022 at 05:00
To: r-help@r-project.org 
Subject: [EXTERNAL] R-help Digest, Vol 238, Issue 16

Message: 1
Date: Thu, 15 Dec 2022 13:52:35 +
From: Stefano Sofia 
To: "r-help@R-project.org" 
Subject: [R] Get data from a list of data frames
Message-ID: 
Content-Type: text/plain; charset="utf-8"

Dear R-list users,

I have a list of n data frames built as follows:


Station1 <- data.frame(sensor = c("thermometer", "raingauge", "snowgauge", 
"anemometer"), code = c(2583, 1478, 3178, NA))
Station2 <- data.frame(sensor = c("thermometer", "raingauge", "snowgauge", 
"anemometer"), code = c(2584, 1479, 3179, 4453))


Total <- list("Station1"=Station1, "Station2"=Station2, ...)

I would need to have a vector with the sensor codes of the thermometers of some 
stations, let's say of Station1, Station2 and Station5 (i.e. c(2583, 2584, 
2587)).
I tried with lapply, but I have not been able to get what I need.
Could you please help me?

Thank you

Stefano



 (oo)
--oOO--( )--OOo--
Stefano Sofia PhD
Civil Protection - Marche Region - Italy
Meteo Section
Snow Section
Via del Colle Ameno 5
60126 Torrette di Ancona, Ancona (AN)
Uff: +39 071 806 7743
E-mail: stefano.so...@regione.marche.it
---Oo-oO








--

Message: 4
Date: Thu, 15 Dec 2022 09:31:51 -0800
From: Bert Gunter 
To: Gerrit Eichner ,
stefano.so...@regione.marche.it
Cc: r-help@r-project.org
Subject: Re: [R] Get data from a list of data frames
Message-ID:

Content-Type: text/plain; charset="utf-8"

Well, just for giggles, Gerrit's solution can be easily vectorized (i.e. no
apply()-type stuff needed) IFF the structure of all the data frames are
identical so that rbind() works:

d <- do.call('rbind', Total[select.stat]) ## one data frame to combine them all ;-)
d[d$sensor == 'thermometer','code']

Whether this is better, worse, or unneeded, I cannot say.
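
For anyone who wants to run it, here is a self-contained sketch using the two stations from the original question. select.stat comes from Gerrit's post, which is not quoted here, so I am assuming it is a character vector of station names.

```r
sensors  <- c("thermometer", "raingauge", "snowgauge", "anemometer")
Station1 <- data.frame(sensor = sensors, code = c(2583, 1478, 3178, NA))
Station2 <- data.frame(sensor = sensors, code = c(2584, 1479, 3179, 4453))
Total    <- list(Station1 = Station1, Station2 = Station2)

select.stat <- c("Station1", "Station2")   # assumed form of the selection

# Vectorized: stack the selected frames, then filter once.
d <- do.call("rbind", Total[select.stat])
thermo <- d[d$sensor == "thermometer", "code"]   # 2583 2584

# The same result with an apply-type loop, keeping the station names.
thermo2 <- sapply(Total[select.stat],
                  function(s) s$code[s$sensor == "thermometer"])
```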

Cheers,
Bert





Re: [R] function doesn't exists but still runs..... (akshay kulkarni)

2023-01-20 Thread Jorgen Harmse via R-help
It may help to expand a bit on Bill Dunlap's answer. I think that library does 
something like this:



Create a new environment for all the package objects. This environment will not 
be directly visible from .GlobalEnv, and ancestor environments may not be 
directly visible either. It may contain functions & other objects that are not 
exported, and it may use objects in ancestor environments that .GlobalEnv 
doesn't see directly. On the other hand, functions in the package will still 
see external functions in the way the package author intended instead of seeing 
functions with the same name that are visible to .GlobalEnv.



Run the source code in the private environment (using source(local=private 
environment, )?). Most package source code just defines functions, but the 
source code could build other objects that the package needs for some reason, 
or it could use delayedAssign to build the objects lazily. By default, the 
environment of any function defined by the source code is the private 
environment, so the function has access to private objects and to anything in 
ancestor environments.



Create a second new environment whose parent is parent.env(.GlobalEnv). For 
every export, assign the corresponding object from the private environment into 
the corresponding name in the public environment. Note that the environment of 
any function is still the private environment in which it was created. (I think 
that a function is mostly determined by its environment, its formals, and its 
body. A function call creates a new environment whose parent is the environment 
of the function. Thus whoever wrote the function can control the search for 
anything that isn't passed in or created by the function itself.)



Reset parent.env(.GlobalEnv) to be the public environment. This makes all the 
exported objects (usually functions) available at the command line and allows 
the user to see everything that was available before (usually by name only, but 
by scope-resolved name if necessary). As noted by Bill Dunlap and in more 
detail above, package functions can use functions & other objects that are not 
directly visible to the user. As he also showed, you can (usually) pierce the 
privacy as long as at least one function is exported. 
environment(package_function) is the private environment, so you can use it to 
see all the private objects and everything in the ancestor environments. You 
can repeat the trick to see private environments of packages you didn't 
directly pull in. I think you can even unlock bindings and do ghastly things to 
the package's private environment.
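
The following toy sketch mimics the private/public split with ordinary environments. It is not what library actually runs; the names private, public, helper and exported are made up, and I deliberately do not touch parent.env(.GlobalEnv).

```r
# "Private" environment: everything the toy package defines lives here.
private <- new.env(parent = baseenv())
eval(quote({
  helper   <- function() "hello from a private helper"   # not exported
  exported <- function() helper()                        # exported
}), envir = private)

# "Public" environment: holds only the exports.
public <- new.env(parent = emptyenv())
assign("exported", get("exported", envir = private), envir = public)

f <- public$exported
f()                                                  # works: helper is found via environment(f)
exists("helper", envir = public, inherits = FALSE)   # FALSE: not visible on the public side
identical(environment(f), private)                   # TRUE: the function kept its birthplace

# Piercing the privacy through an exported function, as Bill Dunlap showed.
environment(f)$helper()

# The same idea with a real package: sd lives in the stats namespace.
environmentName(environment(stats::sd))              # "stats"
```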



Regards,

Jorgen Harmse.

--

Message: 17
Date: Thu, 19 Jan 2023 16:02:31 -0800
From: Bill Dunlap 
To: akshay kulkarni 
Cc: R help Mailing list 
Subject: Re: [R] function doesn't exists but still runs.
Message-ID:

Content-Type: text/plain; charset="utf-8"

Look into R's scoping rules.  E.g.,
https://bookdown.org/rdpeng/rprogdatascience/scoping-rules-of-r.html.

* When a function looks up a name, it looks it up in the environment in
which the function was defined.
* Functions in a package are generally defined in the package's environment
(although sometimes they are in a descendent of the parent's environment).
* When one searches an environment for a name, if it is not found in the
environment the search continues in the parent environment of that
environment, recursively until the parent environment is the empty
environment.

> with(environment(wdman::selenium), java_check)
function ()
{
javapath <- Sys.which("java")
if (identical(unname(javapath), "")) {
stop("PATH to JAVA not found. Please check JAVA is installed.")
}
javapath
}



-Bill

On Thu, Jan 19, 2023 at 2:28 PM akshay kulkarni 
wrote:

> dear members,
> I am using the RSelenium package which uses
> the function selenium() from the wdman package. The selenium function
> contains the function java_check at line 12. If I try to run it, it throws
> an error:
>
> >   javapath <- java_check()
> Error in java_check() : could not find function "java_check"
>
> Also:
>
> > exists("java_check")
> [1] FALSE
>
> But when I run selenium(), it works fine
>
> How do you explain this conundrum? You can refer to this link:
> https://github.com/ropensci/wdman/issues/15
>
> Specifically what concept of R explains this weird behaviour?
>
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI

Re: [R] [EXTERNAL] Re: function doesn't exists but still runs..... (akshay kulkarni)

2023-01-20 Thread Jorgen Harmse via R-help
Hi Akshay,

Lexical scoping and environments are closely tied. (I think Bill even cited the 
documentation.) I guess it's arcane in the sense that scoping usually does what 
you expect, but the way that works is related to what we discussed.

What led you to discover the issue? Were you debugging the public package 
function because it didn't do what you expected, or were you just curious how 
it worked?

Regards,
Jorgen.


From: akshay kulkarni 
Date: Friday, January 20, 2023 at 11:19
To: Jorgen Harmse , r-help@r-project.org 
, williamwdun...@gmail.com 
Subject: [EXTERNAL] Re: function doesn't exists but still runs. (akshay 
kulkarni)
Dear Jorgen,
Thanks for the reply. So, according to you, one can pigeonhole the problem as 
concerning R's lexical scoping rules, am I right? Or is it some arcane concept 
regarding environments?

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Jorgen Harmse 
Sent: Friday, January 20, 2023 9:34 PM
To: r-help@r-project.org ; akshay...@hotmail.com 
; williamwdun...@gmail.com 
Subject: Re: function doesn't exists but still runs. (akshay kulkarni)



Re: [R] [EXTERNAL] Re: function doesn't exists but still runs..... (akshay kulkarni)

2023-01-23 Thread Jorgen Harmse via R-help
Hi Akshay,

I usually use debug (a function provided by R). When you are stepping through a 
function your environment is the one in which function code is being executed, 
so you can easily see everything that is visible to the function. If you single 
step into a function that the first function calls then you also see everything 
that is available to that function. Moreover, you don't see anything that is 
not visible to the function you are debugging, so you can really determine what 
any piece of code would do if called inside the function.

Note 1: Code in R is always executed in an environment. I show in the example 
below that in the empty environment (the ultimate ancestor of all other 
environments) R can't even add. Usually the current environment (e.g. 
.GlobalEnv at the command line or a fresh environment created by a function 
call) has the right contents and the right parent to do what you expect, but in 
some special cases you need to understand how environments work. Even evalq & 
with are functions (unavailable for example in the empty environment), and the 
environment argument has to be evaluated in the current environment before the 
main expression can be evaluated in the environment that you want.


> E[1]
[[1]]
<environment: R_EmptyEnv>

> evalq(2+2, E[[1L]])
Error in 2 + 2 : could not find function "+"
> evalq(2L+2L, E[[1L]])
Error in 2L + 2L : could not find function "+"
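
The same experiment can be reproduced without my list E by using emptyenv() and baseenv() directly. A minimal sketch, with the error caught so that the script keeps running:

```r
# In the empty environment even "+" cannot be found.
res_empty <- tryCatch(evalq(2 + 2, emptyenv()),
                      error = function(e) conditionMessage(e))
res_empty

# In the base environment "+" is available, so the sum is computed.
res_base <- evalq(2 + 2, baseenv())
res_base   # 4
```

Note that evalq itself is looked up in the current environment; only the expression 2 + 2 is evaluated in the environment passed as the second argument.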

Note 2: Besides automatically showing what a function sees, using debug 
(instead of hand-executing lines of code from the function) gives you the 
correct call stack. Suppose that you run some code at the debug command line to 
make a sub-sub-…-function do what you want, and you create in that environment 
what you hope is the correct return value. You can then use parent.frame() to 
put that value into the environment of the caller and run the caller's 
remaining code in the correct environment to see what happens. If there are no 
other problems then you can work your way up to top level and confirm that your 
patch has the right effect before you even modify the actual code of the 
offending function. (Saving all environments in .GlobalEnv forces R to keep 
them even if you quit the debugger. Combining eval & parse is sometimes more 
convenient than using evalq.)
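
A non-interactive sketch of the parent.frame() step. In a real session you would be at the browser prompt after debug(); here the hand-made value is just a constant, and the function names callee and caller are made up.

```r
callee <- function() {
  # Stand-in for a return value worked out by hand at the debug prompt.
  patched <- 42
  # parent.frame() is the environment of the function that called callee().
  assign("result", patched, envir = parent.frame())
  invisible(NULL)
}

caller <- function() {
  callee()
  result * 2   # 'result' exists here only because callee() put it there
}

out <- caller()
out   # 84
```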

Regards,
Jorgen.

--

Message: 2
Date: Sun, 22 Jan 2023 14:25:59 +
From: akshay kulkarni 
To: Jorgen Harmse , "r-help@r-project.org"
, "williamwdun...@gmail.com"

Subject: Re: [R] [EXTERNAL] Re: function doesn't exists but still
runs. (akshay kulkarni)
Message-ID:



Content-Type: text/plain; charset="utf-8"

Dear Jorgen,
Regrets for replying this late.
I got into this issue because it threw an error, and it took more than 4 days 
to fix. I learnt a lot, and one thing I learnt is to debug the function, even 
when it is a public package function, instead of googling the error message. 
Any ideas on how to do this more efficiently?

THanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Jorgen Harmse 
Sent: Friday, January 20, 2023 11:35 PM
To: akshay kulkarni ; r-help@r-project.org 
; williamwdun...@gmail.com 
Subject: Re: [EXTERNAL] Re: function doesn't exists but still runs. (akshay 
kulkarni)



Re: [R] preserve class in apply function

2023-02-08 Thread Jorgen Harmse via R-help
What are you trying to do? Why use apply when there is already a vector 
addition operation?
df$x+df$y or as.numeric(df$x)+as.numeric(df$y) or 
rowSums(df[c('x','y')]).

As noted in other answers, apply will coerce your data frame to a matrix, and 
all entries of a matrix must have the same type.
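
A small sketch of the coercion and of the vectorized alternatives, using a three-row frame in the spirit of the original question (the values themselves are arbitrary):

```r
set.seed(1)
mydf <- data.frame(x = rnorm(3), y = runif(3), z = c("a", "b", "c"))

# apply() works on as.matrix(mydf), and one character column makes everything character.
typeof(as.matrix(mydf))            # "character"
typeof(as.matrix(mydf[c("x", "y")]))   # "double"

# Vectorized sums never go through a matrix of mixed types.
s1 <- mydf$x + mydf$y
s2 <- rowSums(mydf[c("x", "y")])

# If a row-wise function really is needed, mapply keeps each column's type.
s3 <- mapply(function(x, y, z) x + y, mydf$x, mydf$y, mydf$z)
```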

Regards,
Jorgen Harmse.

Message: 1
Date: Tue, 7 Feb 2023 07:51:50 -0500
From: Naresh Gurbuxani 
To: "r-help@r-project.org" 
Subject: [R] preserve class in apply function
Message-ID:



Content-Type: text/plain; charset="us-ascii"


> Consider a data.frame whose different columns have numeric, character,
> and factor data.  In apply function, R seems to pass all elements of a
> row as character.  Is it possible to preserve numeric class?
>
>> mydf <- data.frame(x = rnorm(10), y = runif(10))
>> apply(mydf, 1, function(row) {row["x"] + row["y"]})
> [1]  0.60150197 -0.74201827  0.80476392 -0.59729280 -0.02980335  0.31351909
> [7] -0.63575990  0.22670658  0.55696314  0.39587314
>> mydf[, "z"] <- sample(letters[1:3], 10, replace = TRUE)
>> apply(mydf, 1, function(row) {row["x"] + row["y"]})
> Error in row["x"] + row["y"] (from #1) : non-numeric argument to binary 
> operator
>> apply(mydf, 1, function(row) {as.numeric(row["x"]) + as.numeric(row["y"])})
> [1]  0.60150194 -0.74201826  0.80476394 -0.59729282 -0.02980338  0.31351912
> [7] -0.63575991  0.22670663  0.55696309  0.39587311
>> apply(mydf[,c("x", "y")], 1, function(row) {row["x"] + row["y"]})
> [1]  0.60150197 -0.74201827  0.80476392 -0.59729280 -0.02980335  0.31351909
> [7] -0.63575990  0.22670658  0.55696314  0.39587314







Re: [R] Could you manually replicate execution of a R function

2023-09-20 Thread Jorgen Harmse via R-help
There may be collisions between variables in .GlobalEnv and variables in the 
function-call environment, and the parent of the function-call environment 
probably includes functions & other variables not available in .GlobalEnv. (If 
the function calls substitute or anything like that then the problem becomes 
even harder.) I would probably use the debugger to step into the function. If 
you want more control then create an environment that resembles what would be 
created in a function call:

env.func <- new.env(parent=environment(f))
delayedAssign(assign.env=env.func, …) for everything you pass in
delayedAssign(assign.env=env.func, eval.env=env.func, …) for anything that 
will take a default value
eval(envir=env.func, …) or evalq(envir=env.func, …) to execute parts of the 
function body or anything else

You can even coerce body(f) to a character, strip the leading '{', and 
parse(text=…) to break the function into expressions. That might be easier 
than copy-pasting function code.
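
Here is a runnable sketch of that recipe on a toy function f (made up for the illustration; it is not the rugarch function). Instead of the deparse-and-parse route I use as.list(body(f))[-1L], which also drops the leading '{' and yields one call per statement.

```r
f <- function(x, y = x * 2) {
  z <- x + y
  z^2
}

# Environment that stands in for the one a call to f() would create.
env.func <- new.env(parent = environment(f))

# Supplied argument: a promise evaluated in the caller's environment.
delayedAssign("x", 1 + 2, eval.env = globalenv(), assign.env = env.func)
# Default argument: a promise evaluated inside the call environment itself.
delayedAssign("y", x * 2, eval.env = env.func, assign.env = env.func)

# Break the body into statements and run them one at a time.
steps <- as.list(body(f))[-1L]
for (step in steps) last <- eval(step, envir = env.func)

last                        # 81
identical(last, f(1 + 2))   # TRUE
ls(env.func)                # "x" "y" "z": the locals are there for inspection
```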

Regards,
Jorgen Harmse.


Message: 1
Date: Tue, 19 Sep 2023 23:09:18 +0530
From: Brian Smith 
To: r-help@r-project.org
Subject: [R] Could you manually replicate execution of a R function
Message-ID:

Content-Type: text/plain; charset="utf-8"

Hi,

I have trying to replicate a function from rugarch package manually.
Below is the calculation based on the function,

…



Re: [R] save(), load(), saveRDS(), and readRDS()

2023-09-29 Thread Jorgen Harmse via R-help
Ivan Krylov points out that load(file, e <- new.env()) is cumbersome. I put it 
into a function.

Regards,
Jorgen Harmse.


#' Save & load lists & environments
#'
#' \code{\link{save}} has to be told what to save from an environment, and the obvious way
#' to save a structure creates an extra layer. \code{\link{load}} with default settings
#' clobbers the current environment. \code{save.env} saves a list or environment without an
#' extra layer, and by default saves everything. \code{load.env} loads into an environment,
#' and \code{load.list} loads into a list.
#'
#' @param S something that can be coerced to an environment, e.g. a named \code{list}
#' @param file,envir inputs to \code{save} or \code{load}
#' @param list input to \code{save}
#' @param skip variables in \code{envir} that should not be saved, ignored if \code{list}
#' is provided
#' @param ... inputs to \code{load.env} or additional inputs to \code{save}
#'
#' @return \code{invisible} from \code{save.env}; an \code{environment} from \code{load.env};
#' a \code{list} from \code{load.list}
#'
#' @export

save.env <- function( S, file, list = setdiff(ls(envir),skip),
                      envir = if(missing(S)) parent.frame() else as.environment(S),
                      skip=NULL, ...
)
{ save(list=list, file=file, envir=envir, ...)}

#' @rdname save.env
#'
#' @param keep,remove names of variables to keep or to remove
#' @param absent what to do if variables named in \code{keep} are absent
#' @param parent input to \code{\link{new.env}}
#'
#' @note \code{remove} is forced after the file is loaded, so the default works correctly.
#'
#' @export

load.env <- function( file, keep, remove = if(!missing(keep)) setdiff(ls(envir),keep),
                      absent=c('warn','ignore','stop'), envir=new.env(parent=parent),
                      parent=parent.frame()
)
{ load(file,envir)
  rm(list=remove,envir=envir)
  if ( !missing(keep) && (match.arg(absent) -> absent) != 'ignore'
       && length(keep.absent <- setdiff(keep,ls(envir))) > 0L )
  { print(keep.absent)
    if (absent=='warn') # 'warn' is the first (default) choice of absent
      warning('The variables listed above are absent from the file.')
    else
      stop('The variables listed above are absent from the file.')
  }
  return(envir)
}

#' @rdname save.env
#'
#' @param all.names input to \code{\link{as.list}}
#'
#' @export

load.list <- function(..., all.names=TRUE) as.list(all.names=all.names, load.env(...))
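
For readers who only want the underlying idiom, here is a base-R sketch of the round trip that save.env and load.env wrap. It uses a temporary file and two made-up variables, and it does not depend on the functions above.

```r
tmp <- tempfile(fileext = ".RData")

# Save two variables from a list without creating an extra layer in the file.
src <- list(a = 1, b = "two")
save(list = names(src), file = tmp, envir = as.environment(src))

# Load into a fresh environment, so nothing in the current one is clobbered.
e <- new.env()
loaded <- load(tmp, envir = e)   # load() returns the names it restored
sort(loaded)                     # "a" "b"

# Convert to a list if that is more convenient.
restored <- as.list(e, all.names = TRUE)
restored[c("a", "b")]

unlink(tmp)
```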




--

Message: 2
Date: Fri, 29 Sep 2023 11:42:37 +0300
From: Ivan Krylov 
To: Shu Fai Cheung 
Cc: R mailing list 
Subject: Re: [R] save(), load(), saveRDS(), and readRDS()
Message-ID: <20230929114237.2592975a@Tarkus>
Content-Type: text/plain; charset="utf-8"

On Thu, 28 Sep 2023 23:46:45 +0800
Shu Fai Cheung  wrote:

> In my personal work, I prefer using saveRDS() and readRDS() as I
> don't like the risk of overwriting anything in the global
> environment.

There's the load(file, e <- new.env()) idiom, but that's potentially
a lot to type.



Re: [R] I need to create new variables based on two numeric variables and one dichotomize conditional category variables.

2023-11-03 Thread Jorgen Harmse via R-help
df$LAP <- with(df, ifelse(G=='male', (WC-65)*TG, (WC-58)*TG))

That will do both calculations and merge the two vectors appropriately. It will 
use extra memory, but it should be much faster than a 'for' loop.
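
A tiny worked example, with made-up values for three people, in case the column names in your data differ and you need to adapt it:

```r
df <- data.frame(G  = c("male", "female", "male"),
                 WC = c(90, 80, 100),
                 TG = c(1.5, 2.0, 1.0))

# Both branches are computed for every row; ifelse() then picks per row.
df$LAP <- with(df, ifelse(G == "male", (WC - 65) * TG, (WC - 58) * TG))
df$LAP   # 37.5 44.0 35.0
```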

Regards,
Jorgen Harmse.

--

Message: 8
Date: Fri, 3 Nov 2023 11:10:49 +1030
From: "Md. Kamruzzaman" 
To: r-help@r-project.org
Subject: [R] I need to create new variables based on two numeric
variables and one dichotomize conditional category variables.
Message-ID:

Content-Type: text/plain; charset="utf-8"

Hello Everyone,
I have three variables: Waist circumference (WC), serum triglyceride (TG)
level and gender. Waist circumference and serum triglyceride are numeric and
gender (male and female) is categorical. From these three variables, I want
to calculate the "Lipid Accumulation Product (LAP) Index". The equation to
calculate LAP is different for male and females. I am giving both equations
below.

LAP for male = (WC-65)*TG
LAP for female = (WC-58)*TG

My question is: how can I calculate the LAP and create a single new column?

Your cooperation will be highly appreciated.

Thanks in advance.

With Regards

Md Kamruzzaman

PhD Research Fellow (Medicine)
Discipline of Medicine and Centre of Research Excellence in Translating
Nutritional Science to Good Health
Adelaide Medical School | Faculty of Health and Medical Sciences
The University of Adelaide
Adelaide SA 5005



Re: [R] [EXTERNAL] RE: I need to create new variables based on two numeric variables and one dichotomize conditional category variables.

2023-11-03 Thread Jorgen Harmse via R-help
Yes, that will halve the number of multiplications.

If you're looking for such optimisations then you can also consider 
ifelse(G=='male', 65L, 58L). That will definitely use less time & memory if WC 
is integer, but the trade-offs are more complicated if WC is floating point.
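
A quick sketch of the type behaviour behind that remark; the measurement vectors are invented, and only the storage types are inspected (no timings are claimed):

```r
WC_int <- c(70L, 60L, 75L, 65L)        # integer waist measurements (made up)
WC_dbl <- c(70.5, 60.2, 75.0, 65.3)    # floating-point measurements (made up)
G <- c("male", "female", "male", "female")

off_int <- ifelse(G == 'male', 65L, 58L)  # integer offsets
off_dbl <- ifelse(G == 'male', 65, 58)    # double offsets

t_int_int <- typeof(WC_int - off_int)  # stays integer
t_int_dbl <- typeof(WC_int - off_dbl)  # WC is promoted to double
t_dbl_int <- typeof(WC_dbl - off_int)  # offsets are converted to double
```

With integer WC the integer offsets avoid any conversion; with floating-point WC one of the operands has to be converted either way.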

Regards,
Jorgen Harmse.



From: avi.e.gr...@gmail.com 
Date: Friday, November 3, 2023 at 16:12
To: Jorgen Harmse , r-help@r-project.org 
, mkzama...@gmail.com 
Subject: [EXTERNAL] RE: [R] I need to create new variables based on two numeric 
variables and one dichotomize conditional category variables.
Just a minor point in the suggested solution:

df$LAP <- with(df, ifelse(G=='male', (WC-65)*TG, (WC-58)*TG))

since WC and TG are not conditional, would this be a slight improvement?

df$LAP <- with(df, TG*(WC - ifelse(G=='male', 65, 58)))





Re: [R] [EXTERNAL] Re: I need to create new variables based on two numeric variables and one dichotomize conditional category variables.

2023-11-06 Thread Jorgen Harmse via R-help
That's ingenious, but I would hesitate to rely on a specific mapping between 
strings and integers. (I usually read data frames with stringsAsFactors=FALSE 
or coerce to character later: I don't think it takes more memory.) Maybe create 
another column with the coefficients. What if gender is part of another formula?
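
One way the coefficient-column idea might look, as a sketch only: the lookup vector wc_offset and the column name offset are hypothetical, and the data are made up.

```r
df <- data.frame(G  = c("male", "female", "male", "female"),
                 WC = c(70, 60, 75, 65),
                 TG = c(0.9, 1.1, 1.2, 1.0),
                 stringsAsFactors = FALSE)

# Named lookup vector: the mapping from category to number is explicit
# and does not depend on the order of factor levels.
wc_offset <- c(male = 65, female = 58)

# as.character() guards against G being a factor, which would otherwise
# index the lookup vector by integer code rather than by name.
df$offset <- unname(wc_offset[as.character(df$G)])

# The coefficient column can now be reused by any other formula.
df$LAP <- (df$WC - df$offset) * df$TG
```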

Regards,
Jorgen Harmse.

From: CALUM POLWART 
Date: Saturday, November 4, 2023 at 18:23
To: avi.e.gr...@gmail.com 
Cc: Jorgen Harmse , r-help@r-project.org 
, mkzama...@gmail.com 
Subject: [EXTERNAL] Re: [R] I need to create new variables based on two numeric 
variables and one dichotomize conditional category variables.
I might have factored the gender.

I'm not sure it would in any way be quicker.  But might be to some extent 
easier to develop variations of. And is sort of what factors should be doing...

# make dummy data
gender <- c("Male", "Female", "Male", "Female")
WC <- c(70,60,75,65)
TG <- c(0.9, 1.1, 1.2, 1.0)
myDf <- data.frame( gender, WC, TG )

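# label a factor (levels given explicitly so each label lands on the intended level)
myDf$GF <- factor(myDf$gender, levels = c("Male", "Female"), labels = c(65, 58))

# do the maths (convert the labels, not the integer codes, to numbers)
myDf$LAP <- (myDf$WC - as.numeric(as.character(myDf$GF))) * myDf$TG

#show results
head(myDf)

  gender WC  TG GF  LAP
1   Male 70 0.9 65  4.5
2 Female 60 1.1 58  2.2
3   Male 75 1.2 65 12.0
4 Female 65 1.0 58  7.0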


(Reality: I'd have probably used case_when in tidy to create a new numeric 
column)






Re: [R] I need to create new variables based on two numeric variables and one dichotomize conditional category

2023-11-06 Thread Jorgen Harmse via R-help
Avi: Thank you for checking. I think the optimization is limited. If test is 
all TRUE or all FALSE then at most one vector is evaluated. Anything beyond 
that would be very complicated. (Inspect the two expressions and verify that 
both specify elementwise computations. Then use indexing to shrink the input 
properly. Take into account all recycling rules for binary operations.)


> ifelse(0:1, log(-1:0), 1:2)
Warning in log(-1:0) : NaNs produced
[1]    1 -Inf
> ifelse(c(FALSE,FALSE), log(-1:0), 1:2)
[1] 1 2

I agree that nested ifelse is cumbersome. I wrote a function to address that:


#' Nested conditional element selection
#'
#' \code{ifelses(test1,yes1,test2,yes2,...,no)} is shorthand for
#' \code{ifelse(test1,yes1,ifelse(test2,yes2,...,no))}. The inputs should
#' not be named.
#'
#' @param test1 usually \code{test} for the outer call to \code{\link{ifelse}}
#' @param yes1 \code{yes} for the outer call to \code{ifelse}
#' @param ... usually the \code{(test,yes)} for nested calls followed by \code{no}
#' for the innermost call to \code{ifelse}
#'
#' @note There must be an odd number of inputs. If there is exactly one input then it is
#' returned (unless it is named \code{yes1}): this supports the recursive implementation.
#'
#' @return a vector with entries from \code{yes1} where \code{test1} is \code{TRUE}, else from
#' \code{yes2} where \code{test2} is \code{TRUE}, ..., and from \code{no} where none of
#' the conditions holds
#'
#' @export

ifelses <- function(test1,yes1,...)
{ if (missing(test1))
  { if (!missing(yes1) || length(L <- list(...)) != 1L)
      stop("Wrong number of arguments or confusing argument names.")
    return(L[[1L]])
  }
  if (missing(yes1))
  { if (length(L <- list(...)) != 0L)
      stop("Wrong number of arguments or confusing argument names.")
    return(test1)
  }
  return( ifelse(test1, yes1, ifelses(...)) )
}
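
To illustrate the equivalence with nested ifelse, here is a self-contained sketch. ifelses_sketch is a hypothetical iterative variant written only for this example, not the packaged function: it evaluates every argument up front and skips the argument-name checks.

```r
# Iterative stand-in for the recursive function (hypothetical name).
ifelses_sketch <- function(...) {
  args <- list(...)
  n <- length(args)
  if (n %% 2L != 1L) stop("Need an odd number of inputs.")
  out <- args[[n]]                       # innermost 'no'
  # Work outwards from the last (test, yes) pair to the first.
  for (k in rev(seq_len((n - 1L) %/% 2L)))
    out <- ifelse(args[[2L * k - 1L]], args[[2L * k]], out)
  out
}

x <- c(-2, 0, 3)
chained <- ifelses_sketch(x < 0, "negative", x == 0, "zero", "positive")
nested  <- ifelse(x < 0, "negative", ifelse(x == 0, "zero", "positive"))
agree   <- identical(chained, nested)
```

The flat call reads as a list of condition/value pairs with a default at the end, which is easier to extend than the nested form.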

Regards,
Jorgen Harmse (not Jordan).

--

Message: 10
Date: Sat, 4 Nov 2023 01:08:03 -0400
From: 
To: "'Jorgen Harmse'" 
Cc: 
Subject: Re: [R] [EXTERNAL] RE: I need to create new variables based
on two numeric variables and one dichotomize conditional category
variables.
Message-ID: <019a01da0edc$e41c39e0$ac54ada0$@gmail.com>
Content-Type: text/plain; charset="utf-8"

To be fair, Jordan, I think R has some optimizations so that the arguments
in some cases are NOT evaluated until needed. So only one or the other
choice ever gets evaluated for each row. My suggestion merely has
typographic implications and some aspects of clarity and minor amounts of
less memory and parsing needed.

But ifelse() is currently implemented somewhat too complexly for my taste.
Just type "ifelse" at the prompt and you will see many lines of code that
handle various scenarios.

...

If you later want to add categories such as "transgender" with a value of 61 or 
have other numbers for groups like "Hispanic male", you can amend the 
instructions as long as you put your conditions in an order so that they are 
tried until one of them matches, or it takes the default. Yes, in a sense the 
above is doable using a deeply nested ifelse() but easier for me to read and 
write and evaluate. It may not be more efficient or may be as some of dplyr is 
compiled code.








[R] Building Packages.

2024-03-20 Thread Jorgen Harmse via R-help
I have a source file with roxygen-style comments (and description & licence 
files), and I’m trying to build a package. roxygen2 & devtools seem to work, and 
the tarball exists, but install.packages balks. Does anyone know what’s 
happening?

Regards,
Jorgen Harmse.


> roxygenise(package.dir,clean=TRUE)
Setting `RoxygenNote` to "7.3.1"
✖ roxygen2 requires "Encoding: UTF-8"
ℹ Current encoding is NA
ℹ Loading jhBase
Warning: ── Conflicts ── jhBase conflicts ──
✖ `andNotNA` masks `jhBase::andNotNA()`.
✖ `array.named` masks `jhBase::array.named()`.
✖ `arrayInd.inv` masks `jhBase::arrayInd.inv()`.
  … and more.
ℹ Did you accidentally source a file rather than using `load_all()`?
  Run `rm(list = c("andNotNA", "array.named", "arrayInd.inv",
  "as.POSIXct_orig", "build.package", "colon", "file.info", "file.path",
  "files.removeDup", "fprintf", "grepi", "grepiv", "grepv", "grepvi", "ifelses",
  "index", "load.env", "load.list", "matrix.sq", "merges", "mm", "orNA",
  "pattern.NA", "plots", "printf",
  "save.env", "subs", "symmDiff", "vector.named", "width"))` to remove the
  conflicts.
Writing NAMESPACE
Writing printf.Rd
Writing width.Rd
Writing pattern.NA.Rd
Writing ddply.ns.Rd
Writing as.POSIXct_orig.Rd
Writing mm.Rd
Writing orNA.Rd
Writing merges.Rd
Writing index.Rd
Writing save.env.Rd
Writing build.package.Rd
Writing plots.Rd
Writing ifelses.Rd
Writing subs.Rd
Writing array.named.Rd
Writing grepv.Rd
Writing symmDiff.Rd
Writing file.info.Rd
Writing file.path.Rd
Writing files.removeDup.Rd
Writing arrayInd.inv.Rd

> tar <- devtools::build(package.dir)
── R CMD build
✔  checking for file ‘/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase/DESCRIPTION’ ...
─  preparing ‘jhBase’:
✔  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘jhBase_1.0.1.tar.gz’

> file.info(tar)
                                                                          size isdir mode               mtime               ctime               atime uid gid   uname grname
/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz 14030 FALSE  644 2024-03-20 10:49:10 2024-03-20 10:49:10 2024-03-20 10:49:10 503  20 jharmse  staff

> install.packages(tar,type='source',repos=NULL)
Error in library(jhBase) : there is no package called ‘jhBase’
Execution halted
Warning in install.packages(tar, type = "source", repos = NULL) :
  installation of package ‘/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz’ had non-zero exit status




Re: [R] Building Packages.

2024-03-20 Thread Jorgen Harmse via R-help
Thank you, but I think I was already using utils.

Regards,
Jorgen.


> environment(install.packages)

> utils::install.packages('/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz',type='source',repos=NULL)
Error in library(jhBase) : there is no package called ‘jhBase’
Execution halted
Warning in utils::install.packages("/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz",  :
  installation of package ‘/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz’ had non-zero exit status


From: Ivan Krylov 
Date: Wednesday, March 20, 2024 at 11:14
To: Jorgen Harmse via R-help 
Cc: Jorgen Harmse 
Subject: [EXTERNAL] Re: [R] Building Packages.
On Wed, 20 Mar 2024 16:02:27 +
Jorgen Harmse via R-help wrote:

> > install.packages(tar,type='source',repos=NULL)
>
> Error in library(jhBase) : there is no package called ‘jhBase’
>
> Execution halted
>
> Warning in install.packages(tar, type = "source", repos = NULL) :
>
>   installation of package
> ‘/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz’
> had non-zero exit status

Using RStudio? It happens to override install.packages with a function
that doesn't quite handle file paths. Try utils::install.packages(tar,
type = "source", repos = NULL).

--
Best regards,
Ivan



Re: [R] Building Packages.

2024-03-20 Thread Jorgen Harmse via R-help
I was thinking of making it Open Source, but I haven’t yet. It’s mostly a 
collection of small utility functions (more roxygen comments than actual code). 
I built the package on my Windows machine a few months ago, but my Mac first 
wouldn’t install roxygen2 & devtools and now (with the latest version of R) 
won’t install the tarball that they create. (A work-around that I might try 
again is to build the package under Windows and ship the tarball to my Mac.)

Regards,
Jorgen.

Example:

#' Adjust number of columns used in printing
#'
#' Use \code{\link{options}} to determine the current number of columns, increment
#' or decrement, and pass the result as \code{width} in a second call to \code{options}.
#'
#' @param dw signed amount by which to increment the number of columns
#'
#' @return a list with the old value of \code{options('width')}
#'
#' @export

width <- function(dw) options(width = options('width')[[1L]] + as.integer(dw))


From: Duncan Murdoch 
Date: Wednesday, March 20, 2024 at 12:09
To: Jorgen Harmse , Ivan Krylov , Jorgen 
Harmse via R-help 
Subject: [EXTERNAL] Re: [R] Building Packages.
Is the source for your package online somewhere?

Duncan Murdoch



Re: [R] Building Packages.

2024-03-20 Thread Jorgen Harmse via R-help
Thank you. tools:::.install_packages works.

It happens that one of the functions in my package is a utility to build 
packages. I guess I should change the install step.

Regards,
Jorgen.



#' Build package from source
#'
#' \code{roxygen2} & \code{devtools} have several steps to build a package, and
#' \code{build.package} wraps them in one function. It can also clean up
#' and protect the description file.
#'
#' @param package.dir the directory with the package files arranged as expected by
#' \code{roxygen2} except as noted below
#' @param clean whether to remove old \code{Rd} files before starting
#' @param install whether to install the package after building it, which may make the latest
#' version available in the current R session
#'
#' @note The description file should be in \code{DESC.source} rather than \code{DESCRIPTION}.
#' \code{build.package} will then overwrite the machine-generated \code{DESCRIPTION} from the
#' previous build with the true source.
#'
#' @return \code{invisible()}
#'
#' @export

build.package <- function(package.dir,clean=TRUE,install=TRUE)
{ if (!require(roxygen2))
    stop("Please install roxygen2 and its dependencies.")
  if (!require(devtools))
    stop("Please install devtools and its dependencies.")
  if ( file.exists(file.path(package.dir,"DESC.source") -> DESC.source) )
    file.copy(DESC.source, file.path(package.dir,"DESCRIPTION"), overwrite=TRUE)
  roxygenise(package.dir,clean=clean)
  tar <- devtools::build(package.dir)
  if (install)
    install.packages(tar,type='source',repos=NULL)
  invisible()
}



From: Ivan Krylov 
Date: Wednesday, March 20, 2024 at 14:12
To: Jorgen Harmse 
Cc: Jorgen Harmse via R-help 
Subject: [EXTERNAL] Re: [R] Building Packages.
On Wed, 20 Mar 2024 17:00:34 +
Jorgen Harmse wrote:

> Thank you, but I think I was already using utils.
>
> Regards,
> Jorgen.
>
>
> > environment(install.packages)
>
> 
>
> > utils::install.packages('/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz',type='source',repos=NULL)
> >
>
> Error in library(jhBase) : there is no package called ‘jhBase’

Sorry, then it has been my mistake to blame RStudio for this.

We can try debugging this. If you start a fresh R process and run
tools:::.install_packages(path_to_tarball), the installation will (try
to) proceed in the current process instead of a child process. Once it
fails, traceback() will be available to show you where the error
condition has been raised. What does it say?

Alternatively,

1. Check the package R files for stray library() calls. Generally,
packages should not be calling library().

2. Try a "binary search" approach. Make a copy of your package code but
remove half of the files (or half of the functions if they live in a
single file). Keep removing a half (or go to the other half) depending
on whether the same error keeps happening.
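
The first check could be automated roughly as follows. This is only a sketch: find_library_calls is a hypothetical helper, it assumes the usual layout with sources under R/, and the plain pattern match will also flag calls that appear inside comments.

```r
# Report lines under <pkg_dir>/R that call library() or require().
find_library_calls <- function(pkg_dir) {
  files <- list.files(file.path(pkg_dir, "R"), pattern = "\\.[Rr]$",
                      full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    # \b keeps requireNamespace( from matching the 'require' alternative.
    idx <- grep("\\b(library|require)\\s*\\(", lines, perl = TRUE)
    if (length(idx))
      data.frame(file = basename(f), line = idx, text = trimws(lines[idx]))
  })
  do.call(rbind, hits)   # NULL when nothing was found
}
```

Calling find_library_calls("path/to/package") returns one row per suspicious line, or NULL if there are none.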

Good luck!

--
Best regards,
Ivan



Re: [R] Building Packages.

2024-03-21 Thread Jorgen Harmse via R-help
> Turns out that RStudio replaces the install.packages object in the utils
> package.

> Duncan Murdoch

So RStudio unlocks the bindings and alters the exported environment? That seems 
like another reason to stick to the terminal interface.

>> Thank you. tools:::.install_packages works.

> I'm glad it works, but it shouldn't be necessary to use (and is not
> part of the API: not documented to keep working this way).
> Best regards,
> Ivan [Krylov]

Thank you for letting me know. I hope I can avoid using private functions in 
future. As noted below, my function seems to work now.

> Try setting a breakpoint in system2 before launching your function:

> Best regards,
> Ivan

Now build.package works as written, so there's nothing to debug. The problem 
may have been that this package is so important to me that I put it in 
.Rprofile. The package was not installed for the new version of R, so every R 
session started with an annoying error message. Presumably a separate session 
started with R CMD would just fail without installing the package. That's no 
longer a problem because the package is now installed. However, I don't know 
why the error message wasn't clearer, and I'm puzzled that I was able to 
install roxygen2 & devtools. Thank you everyone, and I'm sorry if I didn't give 
the right information to diagnose the problem faster.
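
If the diagnosis above is right, a guarded call in .Rprofile would avoid the startup error when the package is missing. The following is a sketch under that assumption; attach_if_installed is a hypothetical helper name.

```r
# Possible ~/.Rprofile fragment: attach a package only when it is installed
# for the running version of R, and stay silent otherwise.
attach_if_installed <- function(pkg) {
  ok <- requireNamespace(pkg, quietly = TRUE)
  if (ok) library(pkg, character.only = TRUE)
  invisible(ok)
}

attach_if_installed("jhBase")
```

A session in which jhBase is not installed then starts without raising an error, so it can still be used to install the package.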

Regards,
Jorgen.




Re: [R] duplicated() on zero-column data frames returns empty

2024-04-05 Thread Jorgen Harmse via R-help
(I do not know how to make Outlook send plain text, so I avoid apostrophes.)

For what it is worth, I agree with Mark Webster. The discussion by Ivan Krylov 
is interesting, but if duplicated really treated a row name as part of the row 
then any(duplicated(data.frame(...))) would always be FALSE. My expectation is 
that if key1 is a subset of key2 then all(duplicated(df[key1]) >= 
duplicated(df[key2])) should always be TRUE.
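
A small check of that expectation on made-up data with ordinary (non-empty) keys; the zero-column case is the one under discussion and is deliberately left out:

```r
df <- data.frame(a = c(1, 1, 2, 2, 1),
                 b = c("x", "x", "y", "z", "x"),
                 c = c(TRUE, FALSE, TRUE, TRUE, TRUE))

key1 <- "a"            # subset of key2
key2 <- c("a", "b")

d1 <- duplicated(df[key1])   # duplicates on the smaller key
d2 <- duplicated(df[key2])   # duplicates on the larger key

# A row that repeats on the larger key must also repeat on the smaller key.
monotone <- all(d1 >= d2)
```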

Incidentally, the examples for duplicated and the documentation of unique hint 
that unique(x) is the same as (but more efficient than) x[!duplicated(x)] (for 
a vector) or x[!duplicated(x),,drop=FALSE] (for a data frame), and this seems 
to be true even in the corner case (with what I consider incorrect output from 
both functions). On the other hand, I do not see any explicit guarantee about 
the order of entries in unique(x) (or setdiff(...) or intersect(...)). Code using 
these functions could be more efficient with explicit guarantees, but maybe the 
core team wants to preserve its own flexibility. My suggestion is to include 
some options so users can at least lock in the current behaviour (with a note 
that future versions may achieve it less efficiently). Other options might 
include sort=TRUE in case the core team develops something more efficient than 
sort(unique(...)).
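
The equivalence mentioned above can be checked directly on small made-up inputs. The sketch only compares the two forms with each other and makes no claim about a documented ordering guarantee.

```r
v <- c(3, 1, 3, 2, 1)
same_vec <- identical(unique(v), v[!duplicated(v)])

df <- data.frame(a = c(1, 1, 2), b = c("x", "x", "y"))
same_df <- identical(unique(df), df[!duplicated(df), , drop = FALSE])

# Locking in a particular order explicitly, independent of unique():
sorted_unique <- sort(unique(v))
```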

Regards,
Jorgen.

--

Message: 2
Date: Fri, 5 Apr 2024 11:17:37 +0300
From: Ivan Krylov 
To: Mark Webster via R-help 
Cc: Mark Webster 
Subject: Re: [R]  duplicated() on zero-column data frames returns
empty vector
Message-ID: <20240405111737.2b7e4c3a@arachnoid>
Content-Type: text/plain; charset="utf-8"

Hello Mark,

On Fri, 5 Apr 2024 03:58:36 + (UTC)
Mark Webster via R-help wrote:

> I found what looks to me like an odd edge case for duplicated(),
> unique() etc. on data frames with zero columns, due to duplicated()
> returning a zero-length vector for them, regardless of the number of
> rows:

> df <- data.frame(a = 1:5)
> df$a <- NULL
> nrow(df)
> # 5 (row count preserved by row.names)
> duplicated(df)
> # logical(0), should be c(FALSE, TRUE, TRUE, TRUE, TRUE)
> anyDuplicated(df)
> # 0, should be 2

> This behaviour isn't mentioned in the documentation; is there a
> reason for it to work like this?

<...>

> I admit this is a case we rarely care about. However, for an example
> of this being an issue, I've been running into it when treating data
> frames as database relations, where they have one or more candidate
> keys (irreducible subsets of the columns for which every row must
> have a unique value set).

Part of the problem is that it's not obvious what a
zero-column but non-zero-row data.frame should mean.

On the one hand, your database relation use case is entirely valid. On
the other hand, if data.frames are considered to be tables of data with
row.names as their identifiers, then duplicated(d) should be returning
logical(nrow(d)) for zero-column data.frames, since row.names are
required to be unique. I'm sure that more interpretations can be
devised, requiring some other behaviour for duplicated() and friends.

Thankfully, duplicated() and anyDuplicated() are generic functions, and
you can subclass your data frames to change their behaviour:

duplicated.database_relation <- function(x, incomparables = FALSE, ...)
 if (length(x)) return(NextMethod()) else c(
  FALSE, rep(TRUE, nrow(x) - 1)
 )
.S3method('duplicated', 'database_relation')

anyDuplicated.database_relation <- function(
 x, incomparables = FALSE, ...
) if (nrow(x) > 1) 2 else 0
.S3method('anyDuplicated', 'database_relation')

x <- data.frame(row.names = 1:5)
class(x) <- c('database_relation', class(x))

duplicated(x)
# [1] FALSE  TRUE  TRUE  TRUE  TRUE
anyDuplicated(x)
# [1] 2
unique(x)
# data frame with 0 columns and 1 row

> [[alternative HTML version deleted]]

Since this mailing list eats the HTML parts of the e-mails, we only get
the plain text version automatically prepared by your mailer. This one
didn't look so good:
https://stat.ethz.ch/pipermail/r-help/2024-April/479143.html

Composing your messages to the list in plain text will help avoid the
problem.

--
Best regards,
Ivan





Re: [R] duplicated() on zero-column data frames returns empty

2024-04-08 Thread Jorgen Harmse via R-help
I appreciate the compliment from Ivan and still share the puzzlement at the 
empty return.

What is the policy for changing something that is wrong? There is a trade-off 
between breaking old code that worked around a problem and breaking new code 
written by people who make reasonable assumptions. Mathematically, it seems 
obvious to me that duplicated.matrix(A) should do something like this:

v <- matrix(FALSE, nrow = nrow(A) -> nr, ncol=1L) # or an ordinary vector?
if (nr > 1L) # Check because 2:0 & 2:1 do not do what we want.
{ for (i in 2:nr)
  { for (j in 1:(i-1))
      if (identical(A[i,],A[j,])) # or something more complicated to handle incomparables
      { v[i] <- TRUE; break}
  }
}
v

Of course my code is horribly inefficient, but the difference should be just in 
computing the same result faster. An empty vector of some type is identical to 
an empty vector of the same type, so for a matrix with five rows and no columns 
this computes

      [,1]
[1,] FALSE
[2,]  TRUE
[3,]  TRUE
[4,]  TRUE
[5,]  TRUE

and I argue that that is correct.
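
For anyone who wants to try the argument, the loop can be wrapped as a function. This is a sketch: duplicated_rows is a hypothetical name, it returns a plain logical vector rather than a one-column matrix, and it ignores incomparables.

```r
# Row-wise duplicate check following the loop above.
duplicated_rows <- function(A) {
  nr <- nrow(A)
  v <- logical(nr)
  for (i in seq_len(nr)[-1L])            # empty when nr < 2
    for (j in seq_len(i - 1L))
      if (identical(A[i, ], A[j, ])) { v[i] <- TRUE; break }
  v
}

zero_col <- duplicated_rows(matrix(0, 5, 0))   # five rows, no columns
B <- matrix(c(1, 2, 1, 3, 4, 3), nrow = 3)     # rows 1 and 3 coincide
with_cols <- duplicated_rows(B)
```

Here zero_col marks every row after the first as a duplicate, and with_cols marks only the third row.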

A gap in documentation makes a change to the correct behaviour easier. (If the 
current behaviour were documented then the first step in changing the behaviour 
would be to issue a warning that the change is coming in a future version.) The 
protection for old code could be just a warning that can be turned off with a 
call to options. The new documentation should be more explicit.

Regards,
Jorgen.

From: Mark Webster 
To: Jorgen Harmse , Ivan Krylov

Cc: "r-help@r-project.org" 
Subject: Re: [R] duplicated() on zero-column data frames returns empty
Message-ID: <603481690.9150754.1712522666...@mail.yahoo.com>
Content-Type: text/plain; charset="utf-8"

 duplicated.matrix is an interesting one. I think a similar change would make 
sense, because it would have the dimensions that people would expect when using 
the default MARGIN = 1. However, it could be argued that it's not a needed 
change, because the Value section of its documentation only guarantees the 
dimensions of the output when using MARGIN = 0. In that case, duplicated.matrix 
does indeed return the expected 5x0 matrix for your example:
str(duplicated(matrix(0, 5, 0), MARGIN = 0))
# logi[1:5, 0 ]
Best Regards,
Mark Webster

From: Mark Webster markwebster...@yahoo.co.uk
To: Ivan Krylov ikry...@disroot.org,  
r-help@r-project.org
r-help@r-project.org
Subject: Re: [R]  duplicated() on zero-column data frames returns
empty vector
Message-ID: 
1379736116.7985600.1712306452...@mail.yahoo.com
Content-Type: text/plain; charset="utf-8"

 Do you mean the row names should mean all the rows should be counted as 
non-duplicates? Yes, I can see the argument for that, thanks. I must say I'm 
still puzzled at what interpretation would motivate the current behaviour of 
returning a logical(0), however.

Date: Sun, 7 Apr 2024 11:00:51 +0300
From: Ivan Krylov <ikry...@disroot.org>
To: Jorgen Harmse <jhar...@roku.com>
Cc: "r-help@r-project.org" <r-help@r-project.org>,
"markwebster...@yahoo.co.uk" <markwebster...@yahoo.co.uk>
Subject: Re: [R] duplicated() on zero-column data frames returns empty
Message-ID: <20240407110051.7924c03c@Tarkus>
Content-Type: text/plain; charset="utf-8"

On Fri, 5 Apr 2024 16:08:13 +
Jorgen Harmse <jhar...@roku.com> wrote:

> if duplicated really treated a row name as part of the row then
> any(duplicated(data.frame(...))) would always be FALSE. My expectation
> is that if key1 is a subset of key2 then all(duplicated(df[key1]) >=
> duplicated(df[key2])) should always be TRUE.

That's a good argument, thank you!

Would you suggest similar changes to duplicated.matrix too? Currently
it too returns 0-length output for 0-column inputs:

# 0-column matrix for 0-column input
str(duplicated(matrix(0, 5, 0)))
# logi[1:5, 0 ]

# 1-column matrix for 1-column input
str(duplicated(matrix(0, 5, 1)))
# logi [1:5, 1] FALSE TRUE TRUE TRUE TRUE

# a dim-1 array for >1-column input
str(duplicated(matrix(0, 5, 10)))
# logi [1:5(1d)] FALSE TRUE TRUE TRUE TRUE

--
Best regards,
Ivan






Re: [R] [EXTERNAL] Re: duplicated() on zero-column data frames returns empty

2024-05-13 Thread Jorgen Harmse via R-help
Good luck! It looks like a significant effort for someone not already on the 
team.

Regards,
Jorgen Harmse.

From: Mark Webster 
Date: Monday, May 13, 2024 at 04:07
To: Jorgen Harmse , Ivan Krylov 
Cc: r-help@r-project.org 
Subject: [EXTERNAL] Re: duplicated() on zero-column data frames returns empty
> If you would like to try your hand at developing a patch and make a
> case for it at R-devel or the Bugzilla, the resources at
>  can be helpful.

I am attempting to get admitted onto the Bugzilla at the moment for the data 
frame cases, fingers crossed!

Best Regards,
Mark Webster



Re: [R] Listing folders on One Drive

2024-05-21 Thread Jorgen Harmse via R-help
I would just use
fi <- file.info(dir(path, recursive=TRUE, include.dirs=TRUE, full.names=TRUE))
path could be the OneDrive directory or Scotland (and is not needed if you're 
already in the directory you want).
Then rownames(subset(fi, isdir)) will contain all the directories. Maybe you 
want to use grep or other machinery to thin it out.
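
A minimal sketch of that approach, run against a throwaway directory so that it is self-contained. The folder layout, the `laptop` vector and the river names other than "Tay" are assumptions for illustration; on a real machine `root` would be the locally synced OneDrive folder.

```r
# Hypothetical layout: a "Scotland" folder with one sub-folder per river.
root <- file.path(tempdir(), "Scotland")
dir.create(file.path(root, "Tay"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "Spey"), showWarnings = FALSE)
file.create(file.path(root, "notes.txt"))

# full.names = TRUE lets file.info() find the entries from any working directory.
entries <- dir(root, recursive = TRUE, include.dirs = TRUE, full.names = TRUE)
fi <- file.info(entries)
folders <- basename(rownames(fi)[fi$isdir])

# Cross-check against a second listing (here an invented one).
laptop <- c("Tay", "Spey", "Dee")
missing_on_onedrive <- setdiff(laptop, folders)
print(folders)
print(missing_on_onedrive)
```

setdiff in the other direction, setdiff(folders, laptop), would list folders that exist on the drive but not on the laptop.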

Regards,
Jorgen Harmse.

--


Message: 1
Date: Mon, 20 May 2024 14:36:58 +0100
From: Nick Wray <nickmw...@gmail.com>
To: r-help@r-project.org 
Subject: [R] Listing folders on One Drive
Message-ID:
mailto:ds4we...@mail.gmail.com>>
Content-Type: text/plain; charset="utf-8"


Hello I have lots of folders of individual Scottish river catchments on my
uni One Drive. Each folder is labelled with the river name eg "Tay" and
they are all in a folder named "Scotland"
I want to list the folders on One Drive so that I can cross check that I
have them all against a list of folders on my laptop.
Can I somehow use list.files() - I've tried various things but none seem to
work...
Any help appreciated
Thanks Nick Wray




Re: [R] grep

2024-07-12 Thread Jorgen Harmse via R-help
which(grepl()) looks odd. Doesn't grep by itself return the correct vector 
of indices?
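
A quick check of that claim, using an invented vector of names in place of names(goprobit.p$est), which is not available here:

```r
# Invented stand-in for names(goprobit.p$est).
nm <- c("very good", "somewhat good", "neutral", "somewhat bad", "very bad", "age")

idx_grep  <- grep("very|somewhat", nm)          # integer indices directly
idx_which <- which(grepl("very|somewhat", nm))  # logical vector, then indices

print(idx_grep)
print(identical(idx_grep, idx_which))
```

grepl is still the better choice when the logical vector itself is wanted, for example to combine several conditions with & or |.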

Regards,
Jorgen Harmse.


Message: 5
Date: Fri, 12 Jul 2024 17:42:05 +0800
From: Steven Yen <st...@ntu.edu.tw>
To: Uwe Ligges <lig...@statistik.tu-dortmund.de>, R-help Mailing List
<r-help@r-project.org>
Cc: Steven Yen <sye...@gmail.com>
Subject: Re: [R] grep
Message-ID: <b73784ce-c018-4587-bcd9-64adbd0dc...@ntu.edu.tw>
Content-Type: text/plain; charset="utf-8"


Sorry. grepl worked:


which(grepl("very|somewhat",names(goprobit.p$est)))






Re: [R] round and trailing zero

2024-07-30 Thread Jorgen Harmse via R-help
Duncan Murdoch answered your question, but I have another. Are you going to do 
some computation with the rounded numbers, or are they just for display? (One 
thing I like about Excel is that I can change the display format of a cell 
without changing answers that depend on that cell.) In the latter case, why 
stash them in a variable? For more control of the display, consider sprintf (or 
a wrapper that combines sprintf with cat).
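
A small sketch of the display-only approach, with made-up numbers. The stored values are left untouched and only the printed form carries the trailing zeros.

```r
x <- c(2.5, 3.14159, 10)        # made-up values; x itself is never rounded

shown <- sprintf("%.2f", x)     # character strings with two decimals
cat(shown, sep = "\n")

# Later computations still use the full-precision numbers.
total <- sum(x)
```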

Regards,
Jorgen Harmse.




Re: [R] An error message with the command fm<-1m

2024-08-06 Thread Jorgen Harmse via R-help
> The function is lm(), not 1m().

Eric Berger is correct (except for the extra parentheses), but it is worth 
pointing out that variable names do not begin with digits. (You can use 
backticks, assign, & other features to create such names (e.g. to write the 
Orwellian assignment `2 + 2` <- 5L), but they are non-standard and you need 
special syntax to use them.) Maybe the font makes some characters hard to 
distinguish, but a vertical line at the start of a standard name must be 
lower-case 'l' or upper-case 'I', not '1' or a pipe symbol. A circle or oval 
must be 'o' or 'O', not the digit '0'. (Digits after the first character are 
standard.)
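
To illustrate the point about non-standard names, here is a short sketch. It uses the built-in cars data set, which is my choice and not part of the original question.

```r
# The intended call: lower-case 'l', then 'm'.
fm <- lm(dist ~ speed, data = cars)

# A non-standard name needs backticks both to create and to use it.
`2 + 2` <- 5L
odd <- `2 + 2`      # the variable named "2 + 2"
usual <- 2 + 2      # ordinary arithmetic is unaffected

# make.names() shows what R considers a syntactically valid version of a name.
fixed <- make.names("1m")
print(c(odd = odd, usual = usual))
print(fixed)
```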

Regards,
Jorgen Harmse.





Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?

2024-09-06 Thread Jorgen Harmse via R-help
It seems to me that the documentation of R's complex class & R's atan function 
do not tell us what to expect, so (as others have suggested), some additional 
notes are needed. I think that mathematically atan(1i) should be NA_complex_, 
but R seems not to use any mathematically standard compactification of the 
complex plane (and I'm not sure that IEEE does either).

Incidentally, the signature of the complex constructor is confusing. 
complex(1L) returns zero, but complex(1L, argument=theta) is an element of the 
unit circle. The defaults suggest ambiguous results in case only length.out is 
specified, and you have to read a parenthesis in the details to figure out what 
will happen. Even then, the behaviour in my example is not spelled out 
(although it is suggested by negative inference). Moreover, the real & 
imaginary parts are ignored if either modulus or argument is provided, and I 
don't see that this is explained at all.
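
The constructor behaviour described above can be seen in a few calls. This is a sketch of what I observe, not a statement of what the documentation promises.

```r
z0 <- complex(1L)                                    # only length.out: zero
z1 <- complex(1L, argument = pi / 3)                 # modulus defaults to 1: unit circle
z2 <- complex(real = 5, imaginary = 7, modulus = 2)  # real & imaginary are ignored

print(z0)
print(Mod(z1))
print(z2)
```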

R's numeric (& IEEE's floating-point types) seem to approximate a multi-point 
compactification of the real line. +Inf & -Inf fill out the approximation to 
the extended real line, and NaN, NA_real_ & maybe some others handle some cases 
in which the answer does not live in the extended real line. (I'm not digging 
into bit patterns here. I suspect that there are several versions of NaN, but I 
hope that they all behave the same way.) The documentation suggests that a 
complex scalar in R is just a pair of numeric scalars, so we are not dealing 
with the Riemann sphere or any other usually-studied extension of the complex 
plane. Since R distinguishes various complex infinities (and seems to allow any 
combination of numeric values in real & imaginary parts), the usual 
mathematical answer for atan(1i) may no longer be relevant.

The tangent function has an essential singularity at complex infinity (the 
compactification point in the Riemann sphere, which I consider the natural 
extension for the study of meromorphic functions, for example making the 
tangent function well defined on the whole plane), so the usual extension of 
the plane does not give us an answer for atan(1i). However, another possible 
extension is the Cartesian square of the extended real line, and in that 
extension continuity suggests that tan(x + Inf*1i) = 1i and tan(x - Inf*1i) = 
-1i (for x real & finite). That is the result from R's tan function, and it 
explains why atan(1i) in R is not NA or NaN. The specific choice of pi/4 + 
Inf*1i puzzled me at first, but I think it's related to the branch-cut rules 
given in the documentation. The real part of atan((1+.Machine$double.eps)*1i) 
is pi/2, and the real part of atan((1-.Machine$double.eps)*1i) is zero, and 
someone apparently decided to average those for atan(1i).
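
Anyone who wants to see these values on their own system can run the sketch below. I give no expected output because the results may depend on the platform's C library. Note that the infinite imaginary part is built with complex() because Inf*1i goes through complex multiplication and need not give a clean 0+Inf*1i.

```r
eps <- .Machine$double.eps

just_above <- atan((1 + eps) * 1i)   # just beyond the branch point on the imaginary axis
just_below <- atan((1 - eps) * 1i)   # just short of the branch point
at_point   <- atan(1i)

print(c(above = Re(just_above), below = Re(just_below), at = Re(at_point)))

# tan() far up and far down the imaginary direction, with a finite real part.
up   <- tan(complex(real = 1, imaginary = Inf))
down <- tan(complex(real = 1, imaginary = -Inf))
print(up)
print(down)
```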

TL;DR: The documentation needs more details, and I don't really like the 
extended complex plane that R implemented, but within that framework the 
answers for atan(1i) & atan(-1i) make sense.

Regards,
Jorgen Harmse.







Re: [R] [EXTERNAL] R-help Digest, Vol 260, Issue 19

2024-10-24 Thread Jorgen Harmse via R-help
I think that Stevie Pederson has the right idea, but it is not obvious what the 
threshold should be. Example:

> n <- 2428716; sum(rep(1/n,n)) - 1
[1] -3.297362e-14

I assume that equally large errors in the other direction are also possible.
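
One possible way to pick the threshold is to let it grow with the number of terms. The sketch below is only an assumption about what a reasonable test might look like. The helper name at_most_one and the n * eps tolerance are mine, not anything in randomForest.

```r
# Hypothetical helper: TRUE if sum(p) does not exceed 1 by more than rounding noise.
# The tolerance scales with the number of terms, since each term can contribute
# a rounding error of the order of .Machine$double.eps.
at_most_one <- function(p, tol = length(p) * .Machine$double.eps) {
  sum(p) <= 1 + tol
}

n <- 2428716
ok_small <- at_most_one(rep(1 / 9, 9))
ok_large <- at_most_one(rep(1 / n, n))
too_big  <- at_most_one(c(0.6, 0.5))
print(c(ok_small, ok_large, too_big))
```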

Regards,
Jorgen Harmse.

--

Message: 1
Date: Wed, 23 Oct 2024 15:56:00 +1030
From: Stevie Pederson 
To: r-help@r-project.org
Subject: [R] OSX-specific Bug in randomForest
Message-ID:

Content-Type: text/plain; charset="utf-8"

Hi,

It appears there is an OSX-specific bug in the function
`randomForest.default()`. Going by the source code at
https://github.com/cran/randomForest/blob/master/R/randomForest.default.R
the bug is on line 103.

If the vector `cutoff` is formed using `cutoff <- rep(1/9, 9)` (line #101)
the test on line 103 will fail on OSX as the sum is greater than 1 due to
machine precision errors.

sum(rep(1 / 9, 9)) - 1
# [1] 2.220446e-16

This will actually occur for a scenario when the number of factor levels
(nclass) is 9, 11, 18, 20 etc. The problem does not occur on Linux, and I
haven't tested on Windows.

A suggestion may be to change the opening test

if (sum(cutoff) > 1 || ...)

to

if (sum(cutoff) - 1 > .Machine$double.eps || ...)

however, I'm sure there's a more elegant way to do this

Thanks in advance



Re: [R] Extracting specific arguments from "..."

2025-01-06 Thread Jorgen Harmse via R-help
I think Bert Gunter is right, but do you want partial matches (not found by 
match), and how robust do you want the code to be?

f <- function(...)
{ pos <- match('a', ...names())
  if (is.na(pos))
    stop("a is required.")
  ...elt(pos)
}

Incidentally, what is the best way to extract the expression without evaluating 
it?



g <- function(...)
{ pos <- match('a', ...names())
  if (is.na(pos))
    stop("a is missing.")
  (function(a, ...) substitute(a))(...)
}
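
The point made in the quoted discussion below, that ...elt() does not force the other arguments, can be demonstrated with a deliberately failing argument. The function names f_list and f_elt are mine, and ...names() needs a reasonably recent R (4.1 or later, as far as I know).

```r
f_list <- function(...) list(...)[["a"]]               # evaluates every argument
f_elt  <- function(...) ...elt(match("a", ...names())) # evaluates only 'a'

# 'b' would raise an error if it were ever evaluated.
via_elt <- f_elt(b = stop("b was evaluated"), a = 2)

via_list <- tryCatch(f_list(b = stop("b was evaluated"), a = 2),
                     error = function(e) conditionMessage(e))
print(via_elt)
print(via_list)
```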

Regards,
Jorgen Harmse.

Message: 8
Date: Sun, 5 Jan 2025 11:17:02 -0800
From: Bert Gunter 
To: Iris Simmons 
Cc: R-help 
Subject: Re: [R] Extracting specific arguments from "..."
Message-ID:

Content-Type: text/plain; charset="utf-8"

Thanks, Iris.
That is what I suspected, but it wasn't clear to me from the docs.

Best,
Bert

On Sun, Jan 5, 2025 at 10:16 AM Iris Simmons  wrote:
>
> I would use two because it does not force the evaluation of the other 
> arguments in the ... list.
>
>
>
> On Sun, Jan 5, 2025, 13:00 Bert Gunter  wrote:
>>
>> Consider:
>>
>> f1 <- function(...){
>>   one <- list(...)[['a']]
>>   two <- ...elt(match('a', ...names()))
>>   c(one, two)
>> }
>> ## Here "..." is an argument list with "a" somewhere in it, but in an
>> unknown position.
>>
>> > f1(b=5, a = 2, c=7)
>> [1] 2 2
>>
>> Which is better for extracting a specific named argument, one<- or
>> two<- ?  Or a third alternative that is better than both?
>> Comments and critiques welcome.
>>
>> Cheers,
>> Bert
>>






Re: [R] Extracting specific arguments from "..."

2025-01-07 Thread Jorgen Harmse via R-help
Interesting discussion. A few things occurred to me.

Apologies to Iris Simmons: I mixed up his answer with Bert's question.

Bert raises questions about promises, and I think they are related to John 
Sorkin's question. A big difference between R and most other languages is that 
function arguments are computed lazily. match.call & substitute tell us what 
expressions will be evaluated if function arguments are needed but not the 
environments in which that will happen. The usual suspects are environment() 
and parent.frame(), but parent.frame(k) & maybe even other environments are 
possible. If you are really determined then I guess you can keep evaluating 
match.call() in parent frames until you have accounted for all the inputs.

It's not clear to what extent John Sorkin is concerned about writing functions 
as opposed to using functions. Lazy computation has advantages but leads to 
some issues.
Exactly matching the function's default expression for an input is not 
necessarily the same as omitting the input. The evaluation environment is 
different.
If the caller uses an expression with side effects then there is no guarantee 
that the side effects will happen. If there are side effects from two or more 
inputs then the order is uncertain. (If an argument is not supplied and the 
default has side effects then they might not happen either. However, I don't 
know why the function writer would specify any side effect except stop(), and 
then he or she has probably arranged for it to happen exactly when it should.)
If a default value depends on another input and that input is modified inside 
the function then order of evaluation of inputs becomes important. Even if you 
know exactly what you're doing when you write the function, you should make it 
clear to future maintainers. An explicit call to force clarifies that the input 
needs to be computed with the existing values of anything that is used in the 
default, even if the code is refactored so that the value is not used 
immediately. If you really want to modify another input before evaluating the 
default then specify that in a comment.
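
A small sketch of the last point. The two functions are invented for illustration and differ only in the call to force.

```r
# The default for 'factor' depends on 'x', and 'x' is modified in the body.
scale_forced <- function(x, factor = 2 * x) {
  force(factor)   # evaluate the default now, with the caller's x
  x <- x + 1
  x * factor
}

scale_lazy <- function(x, factor = 2 * x) {
  x <- x + 1      # the default has not been evaluated yet ...
  x * factor      # ... so it is computed here from the modified x
}

forced <- scale_forced(1)   # factor = 2, x = 2
lazy   <- scale_lazy(1)     # x = 2, factor = 4
print(c(forced = forced, lazy = lazy))
```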

Jeff Newmiller makes a good point. You can still change your mind about 
inspecting a particular input without breaking old code that uses your 
function, and you don't necessarily need default values.

Old definition: f <- function(...) {}

New definition:
f <- function(..., a = )
{ 
  
}

OR

f <- function(..., a)
{ if (missing(a)) # OK, this becomes clunky if there are several such inputs
  { < pass ... to another function >}
  else
  {  # Pitfall: Changing the order of evaluation may break 
     # old code, but then the design was probably too devious in the first place.

  }
  
}

Regards,
Jorgen Harmse.






Re: [R] R CMD check says no visible binding for global variable

2025-01-29 Thread Jorgen Harmse via R-help

Hi Naresh Gurbuxani,

There are already several answers dealing with the specific code that you 
wrote, but my reaction is to step back a little.

R CMD ... starts an R session but takes standard input from a file. (In Unix-like 
systems you might even be able to make an R script into an executable file.) I 
think it even uses the same .Rprofile as a regular R session. If there is a 
problem, you can paste the code a few lines at a time into a new ordinary R 
session. Then you can also set breakpoints (using debug, trace, or similar) in 
whatever functions you think caused the problem. You could also wrap all the 
code in a function and use the debugger to step through that (which may help if 
the error occurs in an iteration of a loop body).

Regards,
Jorgen Harmse.


Message: 1
Date: Mon, 27 Jan 2025 22:46:21 +
From: Naresh Gurbuxani 
To: "r-help@r-project.org" 
Subject: [R] R CMD check says no visible binding for global variable
Message-ID:



Content-Type: text/plain; charset="utf-8"

I have written a function which returns an SQL query result as a data.frame.  
Each column of data.frame is a variable not explicitly defined.

For every column name, R CMD check says "no visible binding for global variable". 
Status: 1 NOTE

Is it possible to tell R CMD check that these variables are OK?

Thanks,
Naresh

Sent from my iPhone





Re: [R] What don't I understand about sample()?

2025-03-14 Thread Jorgen Harmse via R-help
I agree with the other answers. In particular, Bert Gunter points out that each 
argument to a function is evaluated at most once. Default arguments can use 
information in the callee's frame (and order of evaluation may matter), but 
arguments provided by the caller are evaluated in the caller's environment (or 
an ancestor in the call-stack hierarchy), so there is no way for sample to know 
that matrix prefers to see 50 values. If you are determined to have repeated 
evaluation (instead of simply telling sample what size you want) then you need 
a function that accepts an expression as input.

Regards,
Jorgen Harmse.

> arrayE <- function(E, dim)
+ { N <- prod(dim)
+   x <- numeric(0L)
+   while (length(x) [...]
> arrayE(parse(text='sample(1:10, replace=TRUE)'), c(5,10))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]   10    7   10    8    6    5    6    9    1     9
[2,]    6    3    3    1    8    1    2    9    8     5
[3,]    4    2    1    3    9    5    7   10    1     2
[4,]    1    7    1    6    7    3    3    6    1     2
[5,]    9    6    8    5    3    5    3    4    5     1
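
Part of the function definition did not survive in the archive. A sketch of a function in the same spirit follows, assuming the idea is to re-evaluate the expression until enough values have been collected. The name array_from_expr and the loop body are my reconstruction, not the posted code, and the loop would never end if the expression returned a zero-length result.

```r
# Hypothetical reconstruction: evaluate E repeatedly until prod(dim) values exist.
array_from_expr <- function(E, dim) {
  N <- prod(dim)
  x <- numeric(0L)
  while (length(x) < N) {
    x <- c(x, eval(E))        # each pass draws a fresh sample
  }
  array(x[seq_len(N)], dim)   # fills column by column, like matrix()
}

m <- array_from_expr(quote(sample(1:10, replace = TRUE)), c(5, 10))
print(m)

# For independent permutations in the rows, replicate() is a simpler route.
perms <- t(replicate(5, sample(1:10)))
print(perms)
```

replicate() re-evaluates its expression argument on every iteration, which is exactly the repeated evaluation that matrix(sample(...)) cannot provide.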


--

Message: 2
Date: Thu, 13 Mar 2025 21:00:26 +
From: Kevin Zembower 
To: r-help@r-project.org 
Subject: [R] What don't I understand about sample()?
Message-ID:

<01000195914ef9c4-7adadf5d-0069-4794-af09-454452b71c3d-000...@email.amazonses.com>

Content-Type: text/plain; charset="utf-8"

Hello, all,

I'm learning to do randomized distributions in my Stats 101 class*. I
thought I could do it with a call to sample() inside a matrix(), like:

> matrix(sample(1:10, replace=TRUE), 5, 10, byrow=TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    8    2    3    1    8    2    8    8    9     8
[2,]    8    2    3    1    8    2    8    8    9     8
[3,]    8    2    3    1    8    2    8    8    9     8
[4,]    8    2    3    1    8    2    8    8    9     8
[5,]    8    2    3    1    8    2    8    8    9     8
>

Imagine my surprise to learn that all the rows were the same
permutation. I thought each time sample() was called inside the matrix,
it would generate a different permutation.

I modeled this after the bootstrap sample techniques in
https://pages.stat.wisc.edu/~larget/stat302/chap3.pdf. I don't
understand why it works in bootstrap samples (with replace=TRUE), but
not in randomized distributions (with replace=FALSE).

Thanks for any insight you can share with me, and any suggestions for
getting rows in a matrix with different permutations.

-Kevin

*No, this isn't a homework problem. We're using Lock5 as the text in
class, along with its StatKey web application. I'm just trying to get
more out of the class by also solving our problems using R, for which
I'm not receiving any class credit.


--

Message: 5
Date: Thu, 13 Mar 2025 14:33:40 -0700
From: Bert Gunter 
To: Kevin Zembower 
Cc: "r-help@r-project.org" 
Subject: Re: [R] What don't I understand about sample()?
Message-ID:

Content-Type: text/plain; charset="utf-8"

Bravo for your unrequired R efforts.

You misunderstand the nested call. sample() is called only once,
producing 1 sample of 10 with replacement. Since your matrix call
needs 50 values, ?matrix tells you (in details):
"If there are too few elements in data to fill the matrix, then the
elements in data are recycled. If data has length zero, NA of an
appropriate type is used for atomic vectors (0 for raw vectors) and
NULL for lists."

This sort of "recycling" is quite standard in R. Though not universal.

Cheers,
Bert

"An educated person is one who can entertain new ideas, entertain
others, and entertain herself."

On Thu, Mar 13, 2025 at 2:23 PM Kevin Zembower via R-help
 wrote:
>
> Hello, all,
>
> I'm learning to do randomized distributions in my Stats 101 class*. I
> thought I could do it with a call to sample() inside a matrix(), like:
>
> > matrix(sample(1:10, replace=TRUE), 5, 10, byrow=TRUE)
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,]    8    2    3    1    8    2    8    8    9     8
> [2,]    8    2    3    1    8    2    8    8    9     8
> [3,]    8    2    3    1    8    2    8    8    9     8
> [4,]    8    2    3    1    8    2    8    8    9     8
> [5,]    8    2    3    1    8    2    8    8    9     8
> >
>
> Imagine my surprise to learn that all the rows were the same
> permutation. I thought each time sample() was called inside the matrix,
> it would generate a different permutation.
>
> I modeled this after the bootstrap sample techniques in
> https://pages.stat.wisc.edu/~larget/stat302/chap3.pdf. I don't
> understand why it works in bootstrap samples (with replace=TRUE), but
> not in randomized distributions (with replace=FALSE).
>
> Thanks for any insight you can share with me, and any suggestions for
> getting rows in a matrix with different permutations.
>
> -Kevin
>
> *No, this isn't a homework problem. We're using Lock5 as the text in
> class, alon