Re: [R] What don't I understand about sample()?

Jorgen Harmse via R-help Fri, 14 Mar 2025 09:31:55 -0700

I agree with the other answers. In particular, Bert Gunter points out that each 
argument to a function is evaluated at most once. Default arguments can use 
information in the callee's frame (and order of evaluation may matter), but 
arguments provided by the caller are evaluated in the caller's environment (or 
an ancestor in the call-stack hierarchy), so there is no way for sample() to 
know that matrix() wants 50 values. If you are determined to have repeated 
evaluation (instead of simply telling sample() what size you want), then you 
need a function that accepts an expression as input.
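
The "evaluated at most once" point can be checked directly. The following is 
only a sketch: the helper use_twice(), the function draw() and the counter 
n_calls are hypothetical names made up for this illustration.

```r
# Hypothetical illustration: count how often the argument expression runs.
n_calls <- 0
draw <- function() {
  n_calls <<- n_calls + 1   # side effect makes each evaluation visible
  sample(1:10)
}

# use_twice() refers to its argument twice, but the promise behind `x`
# is forced on first use and the value is cached afterwards.
use_twice <- function(x) list(first = x, second = x)

res <- use_twice(draw())
identical(res$first, res$second)  # both uses see the same permutation
n_calls                           # number of times draw() actually ran
```

The same thing happens to sample(...) when it is passed to matrix(): it is 
one promise, forced once, and matrix() only ever sees the resulting 10 values.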

Regards,
Jorgen Harmse.

> arrayE <- function(E, dim)
+ { N <- prod(dim)
+   x <- numeric(0L)
+   while (length(x)<N)
+     x <- c(x, eval(E, parent.frame()))
+   array(x[1:N], dim=dim)
+ }
> arrayE(parse(text='sample(1:10, replace=TRUE)'), c(5,10))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]   10    7   10    8    6    5    6    9    1     9
[2,]    6    3    3    1    8    1    2    9    8     5
[3,]    4    2    1    3    9    5    7   10    1     2
[4,]    1    7    1    6    7    3    3    6    1     2
[5,]    9    6    8    5    3    5    3    4    5     1
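
A variant of the same idea, offered only as a sketch: instead of an expression 
that is eval()ed, pass a zero-argument function that is called repeatedly. The 
name arrayF is hypothetical, and the loop assumes every call returns at least 
one value (otherwise it would never terminate).

```r
# Hypothetical variant of arrayE: `f` is a zero-argument function that is
# called repeatedly until enough values have been collected.
arrayF <- function(f, dim) {
  N <- prod(dim)
  x <- NULL
  while (length(x) < N)
    x <- c(x, f())            # assumes f() returns at least one value
  array(x[seq_len(N)], dim = dim)
}

# array() fills column by column, so with dim c(10, 5) each column holds one
# fresh permutation of 1:10; t() turns those columns into rows.
m <- t(arrayF(function() sample(1:10), c(10, 5)))
dim(m)
```

Passing a function avoids parse(text=...) and makes the evaluation environment 
explicit, since the function carries its own enclosure.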

------------------------------

Message: 2
Date: Thu, 13 Mar 2025 21:00:26 +0000
From: Kevin Zembower <ke...@zembower.org>
To: r-help@r-project.org <r-help@r-project.org>
Subject: [R] What don't I understand about sample()?
Message-ID:
        <01000195914ef9c4-7adadf5d-0069-4794-af09-454452b71c3d-000...@email.amazonses.com>
Content-Type: text/plain; charset="utf-8"

Hello, all,

I'm learning to do randomized distributions in my Stats 101 class*. I
thought I could do it with a call to sample() inside a matrix(), like:

> matrix(sample(1:10, replace=TRUE), 5, 10, byrow=TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    8    2    3    1    8    2    8    8    9     8
[2,]    8    2    3    1    8    2    8    8    9     8
[3,]    8    2    3    1    8    2    8    8    9     8
[4,]    8    2    3    1    8    2    8    8    9     8
[5,]    8    2    3    1    8    2    8    8    9     8
>

Imagine my surprise to learn that all the rows were the same
permutation. I thought each time sample() was called inside the matrix,
it would generate a different permutation.

I modeled this after the bootstrap sample techniques in
https://pages.stat.wisc.edu/~larget/stat302/chap3.pdf. I don't
understand why it works in bootstrap samples (with replace=TRUE), but
not in randomized distributions (with replace=FALSE).

Thanks for any insight you can share with me, and any suggestions for
getting rows in a matrix with different permutations.

-Kevin

*No, this isn't a homework problem. We're using Lock5 as the text in
class, along with its StatKey web application. I'm just trying to get
more out of the class by also solving our problems using R, for which
I'm not receiving any class credit.

------------------------------

Message: 5
Date: Thu, 13 Mar 2025 14:33:40 -0700
From: Bert Gunter <bgunter.4...@gmail.com>
To: Kevin Zembower <ke...@zembower.org>
Cc: "r-help@r-project.org" <r-help@r-project.org>
Subject: Re: [R] What don't I understand about sample()?
Message-ID:
        <CAGxFJbTKagSSs=t7vnhdjsj+rtdtkkud8e5q2f1chwjc_f9...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Bravo for your unrequired R efforts.

You misunderstand the nested call. sample() is called only once,
producing 1 sample of 10 with replacement. Since your matrix call
needs 50 values, ?matrix tells you (in Details):
"If there are too few elements in data to fill the matrix, then the
elements in data are recycled. If data has length zero, NA of an
appropriate type is used for atomic vectors (0 for raw vectors) and
NULL for lists."

This sort of "recycling" is quite standard in R, though not universal.
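
The recycling, and two ways around it, can be sketched as follows. This is an 
illustration rather than the only approach, and the variable names (one_draw, 
recycled, perms, boot) are made up for the example.

```r
# Recycling: 10 values are stretched to fill 50 cells, so with byrow = TRUE
# every row repeats the same draw.
one_draw <- sample(1:10, replace = TRUE)
recycled <- matrix(one_draw, 5, 10, byrow = TRUE)

# replicate() re-evaluates its expression on every iteration and binds the
# results as columns; t() puts one independent permutation in each row.
perms <- t(replicate(5, sample(1:10)))

# For draws with replacement, asking sample() for all 50 values also works.
boot <- matrix(sample(1:10, size = 50, replace = TRUE), 5, 10, byrow = TRUE)
```

Note that the size = 50 form only suits sampling with replacement; for 
permutations (replace=FALSE) each row needs its own call, as in the 
replicate() line.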

Cheers,
Bert

"An educated person is one who can entertain new ideas, entertain
others, and entertain herself."

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
