Re: [R] Two envelopes problem

Mario Mon, 25 Aug 2008 16:42:28 -0700

Dear Greg,

The problem is that in your code you are creating a distribution where there are only 5-10 and 10-20 pairs. Yes, if I knew that there are only those two types of pairs, and if I knew that the probability of each pair was .50, .50, then it is advantageous to switch, but that is because I have a priori information on the distribution of the pairs. Now let's assume I opened the envelope and I see £20, should I switch? Well, now it depends. Are you going to rewrite the simulation so that line 1 within tmpfun reads x <- sample(c(10,20), 1)? Otherwise it is not going to work. The question is, if I see £20, according to my friend's argument, I should switch, since there is a 50% chance of seeing 40 and a 50% chance of seeing 10. However, in your simulation, £40's are never seen, so under your simulator, switching every time you see a £20 is a sure loss. You see, that's the problem. You don't know the distribution of pairs; the fact that you've got a tenner does not give you any additional information. By the way, you could run your example with my own code (which is faster as I'm not using sample for the creation of the env pairs), just define the function


r5.10 <- function(n) return (sample(c(5,10), n, replace=TRUE))

and now:

env <- generateenv(r=2, r5.10, n=1e6)
i10 <- which(env[,1]==10)
mean(env[i10,1]) # Exactly 10
mean(env[i10,2]) # ~ 12.50 you do get a definite advantage when switching

But this is an example that was tailored to work with the actual value of £10.

system.time(env <- generateenv(r=2, r5.10, n=1e6)) # 0.500 0.020 0.521
system.time(replicate(1e6, tmpfun())) # 38.211 0.148 38.364, which is ~76 times slower
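
To make the point about £20 concrete, here is a minimal sketch that conditions on every value the first envelope can show under the 5-10 / 10-20 setup. It does not use generateenv or tmpfun; the names x, swap, env1, env2 and cond_mean are introduced here only for illustration.

```r
set.seed(1)
n <- 1e5
x <- sample(c(5, 10), n, replace = TRUE)  # smaller amount of each pair
swap <- as.logical(rbinom(n, 1, 0.5))     # TRUE: the first envelope holds 2x
env1 <- ifelse(swap, 2 * x, x)
env2 <- ifelse(swap, x, 2 * x)

# Mean of the second envelope for each value seen in the first one
cond_mean <- tapply(env2, env1, mean)
cond_mean
```

Under these assumptions, seeing 5 means the other envelope always holds 10, seeing 20 means it always holds 10, and only seeing 10 gives the mean of about 12.5.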


Cheers,
Mario.



Greg Snow wrote:

I think it is like the girl problem, it is a matter of what you condition on 
(condition on at least 1 girl, or on a specific one being a girl).

In your simulations (at least from what I can see), you generate an x from a 
distribution, randomize the order of x and 2x, then look at the average of env1 
and the average of env2, but you never condition on the information you get 
from env1, so you are answering a different problem.

Try this code:

tmpfun <- function() {
        x <- sample( c(5,10), 1 )
        x2 <- 2*x
        sample( c(x,x2) )
}

out1 <- replicate( 100000, tmpfun() )
mean( out1[2, out1[1,] == 10] )
sum(  out1[1,] == 10 )

Here we first generate the data such that one of the envelopes will have 10, 
then only look at the cases where env1 is 10, the average of env 2 in this case 
is about 12.5.

Now if we look at choosing env2 first:

mean( out1[1, out1[2,] == 10] )

We still have a mean of switching of about 12.5 (the paradox).

The first envelope gives us information which changes the problem.  You are 
looking at the problem from the start, without the information, not 
conditioning on that information.


Imagine a situation where you have 4 cards, 2 black and 2 red.  You shuffle the 
cards and randomly draw one out and place it face down in front of you (all 
backs are identical, probability of choosing each card is 1/4).  Sitting at the 
table are 3 of your friends, you show friend #1 that one of the cards still in 
your hand (chosen randomly) is black, you show friend #2 that one of the cards 
in your hand (chosen randomly) is red, friend #3 sees none of your cards and 
they do not share any information about what they see.  Now you ask them to 
each write down the probability that the face down card is black.  Friend #1 
will write down 1/3, #2 will write 2/3, and friend #3 will write 1/2.  They are 
all looking at the same face down card, but each has different information to 
condition on (and if we simulate each friend's case, we will find that their 
probabilities are correct for their knowledge).
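
The three answers can be checked with a short simulation. This is only a sketch under one reading of the setup: each friend is shown a card picked uniformly at random from the three cards in the hand, and we condition on the colour that card turns out to have. The variable names are made up for the example.

```r
set.seed(1)
n <- 1e5
deck <- c("B", "B", "R", "R")

# Each column is one deal: row 1 is the face-down card,
# row 2 is a randomly chosen card from the three left in the hand
deals <- replicate(n, sample(deck, 2))
face_down <- deals[1, ]
shown <- deals[2, ]

p1 <- mean(face_down[shown == "B"] == "B")  # friend #1 saw a black card
p2 <- mean(face_down[shown == "R"] == "B")  # friend #2 saw a red card
p3 <- mean(face_down == "B")                # friend #3 saw nothing
c(p1, p2, p3)
```

The estimates should land close to 1/3, 2/3 and 1/2 respectively.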

The 2 envelope case is a special case of a larger problem.  Assume that there are n 
envelopes each with a different value in them (you know nothing about the 
distribution of the values).  You can open an envelope and see what is inside, 
then either keep that amount (and the game ends), or choose another envelope, 
but once you choose to open a new envelope you can never go back to a previous 
one.  The goal is to determine a strategy that will maximize your expected 
winnings.  The best strategy that I have heard (I saw the proof once, but don't 
remember the details) is to open about 1/3 of the envelopes (actually I think 
it was 1/e or maybe 1/pi, but 1/3 is a good approximation to both of those), 
then continue opening envelopes until the first one that is greater than the 
maximum of the 1st 1/3 and stop there (or stop at the last envelope if the 
maximum was in that first 1/3).  As far as I know, nobody has come up with a 
better strategy yet (and some math people may have proven it is the best 
possible).  Without any knowledge you don't know if the first envelope is high 
or low, but by opening the 1st 1/3 you get information that helps form the rest 
of the strategy.  If n=2 (your 2 envelope case), then 1/3 of 2 rounds off to 1, 
select the first envelope, then go on to the next (which in this case also 
happens to be the last).  Always switching follows this same strategy.
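
The stopping rule described above is usually discussed as the secretary problem, where the classical cutoff is about n/e and the quantity being maximized is the probability of ending on the largest value rather than the expected winnings. A rough sketch of that rule, with hypothetical helpers play and p_win and the ranks 1..n standing in for the unknown amounts:

```r
set.seed(1)

# Hypothetical helper: one game with n envelopes. Open the first k without
# keeping any, then keep the first later envelope that beats all of them.
play <- function(n, k) {
  vals <- sample(n)                          # ranks 1..n in random order, n is best
  if (k == 0) return(vals[1])                # no look-ahead: keep the first one
  best_seen <- max(vals[1:k])
  later <- which(vals[-(1:k)] > best_seen)
  if (length(later) == 0) return(vals[n])    # best was among the first k
  vals[k + later[1]]
}

# Estimated probability of ending up with the largest amount
p_win <- function(n, k, reps = 20000) mean(replicate(reps, play(n, k)) == n)

n <- 20
p_none   <- p_win(n, 0)                   # keep the first envelope
p_cutoff <- p_win(n, round(n / exp(1)))   # skip about n/e envelopes first
p_two    <- p_win(2, 1)                   # n = 2: open one, then always switch
c(p_none, p_cutoff, p_two)
```

With n = 2 the rule reduces to play(2, 1), which always ends on the second envelope, and it finds the larger amount about half of the time, the same as never switching.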

It is paradoxical, but in real life (and in your simulations) we generally have 
more information than what is in the puzzle.






--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: Mario [mailto:[EMAIL PROTECTED]
Sent: Monday, August 25, 2008 2:51 PM
To: Greg Snow; r-help@r-project.org
Subject: Re: [R] Two envelopes problem

No, no, no. I have solved the Monty Hall problem and the
Girl's problem and this is quite different. Imagine this, I
get the envelope and I open it and it has £A (A=10 or any
other amount it doesn't matter), a third friend gets the
other envelope, he opens it, it has £B, now £B could be
either £2A or £A/2. He doesn't know what I have, he doesn't
have any additional information. According to your logic, he
should switch, as he has a 50% chance of having £2B and 50%
chance of having £B/2. But the same logic applies to me. In
conclusion, it's advantageous for both of us to switch. But
this is a paradox, if I'm expected to make a profit, then
surely he's expected to make a loss! This is why this problem
is so famous. If you look at the last lines of my simulation,
I get, conditional on the first envelope having had £10, that
the second envelope has £5 approximately 62.6% of the time
and £20 approximately 37.4% of the time. In fact, it doesn't matter
what the original distribution of money in the envelopes is,
conditional on the first having £10, you should exactly see
2/3 of the second envelopes having £5 and 1/3 having £20. But
I'm getting a slight deviation from this ratio, which is
consistent, and I don't know why.
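
One way to see where a split like 62.6% / 37.4% could come from is to compute it directly instead of simulating. The sketch below assumes x = ceiling(rexp(n, rate=1/10)), as in the last example of the original script; p_x and the other names are introduced here only for illustration.

```r
rate <- 1/10

# P(x = k) when x = ceiling of an exponential: probability mass on (k - 1, k]
p_x <- function(k) pexp(k, rate) - pexp(k - 1, rate)

# Seeing 10 first means either x = 5 (we hold 2x) or x = 10 (we hold x).
# Each orientation has probability 1/2, which cancels in the ratio.
p_other_5  <- p_x(5) / (p_x(5) + p_x(10))
p_other_20 <- 1 - p_other_5
expected_other <- 5 * p_other_5 + 20 * p_other_20
c(p_other_5, p_other_20, expected_other)
```

Under this assumption the split is roughly 0.62 / 0.38 and the expected value of the other envelope is about 10.66, so the ratio depends on how likely x = 5 is relative to x = 10 under the chosen distribution.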

Cheers,
Mario.

Greg Snow wrote:

You are simulating the answer to a different question.

Once you know that one envelope contains 10, then you know,
conditional on that information, that either x=10 and the
other envelope holds 20, or 2*x=10 and the other envelope
holds 5.  With no additional information and assuming random
choice we can say that there is a 50% chance of each of
those.  A simple simulation (or the math) shows:

tmp <- sample( c(5,20), 100000, replace=TRUE )
mean(tmp)

[1] 12.5123

Which is pretty close to the math answer of 12.5.

If you have additional information (you believe it unlikely
that there would be 20 in one of the envelopes, the envelope
you opened has 15 in it and the other envelope can't have 7.5
(because you know there are no coins and there is no such
thing as a .5 bill in the local currency), etc.) then that
will change the probabilities, but the puzzle says you have
no additional information.

Your friend is correct in that switching is the better strategy.

Another similar puzzle that a lot of people get confused over is:

"I have 2 children, one of them is a girl, what is the
probability that the other is also a girl?"

Or even the classic Monty Hall problem (which has many
answers depending on the motivation of Monty).

Hope this helps,

(p.s., the above children puzzle is how I heard the puzzle;
I actually have 4 children (but the 1st 2 are girls, so it
was accurate for me for a while).)

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mario
Sent: Monday, August 25, 2008 1:41 PM
To: r-help@r-project.org
Subject: [R] Two envelopes problem

A friend of mine came to me with the two envelopes problem. I hadn't
heard of this problem before and it goes like this: someone puts an
amount `x' in an envelope and an amount `2x' in another. You choose
one envelope randomly, you open it, and inside there is, say, £10.
Now, should you keep the £10 or swap envelopes and keep whatever is
inside the other envelope? I told my friend that swapping is
irrelevant since your expected earnings are 1.5x whether you swap or
not. He said that you should swap, since if you have £10 in your
hands, then there's a 50% chance of the other envelope having £20 and
a 50% chance of it having £5, so your expected earnings are £12.5,
which is more than £10, justifying the swap. I told my friend that he
was talking nonsense. I then proceeded to write a simple R script
(below) to simulate random money in the envelopes and it convinced me
that the expected earnings are simply 1.5 * E(x), where E(x) is the
expected value of x, a random variable whose distribution can be set
arbitrarily. I later found out that this is quite an old and well
understood problem, so I got back to my friend to explain to him why
he was wrong, and then he insisted that in the definition of the
problem he specifically said that you happened to have £10 and no
other value, so it is still better to swap. I thought that it would be
simple to prove in my simulation that in those instances in which £10
happened to be the value seen in the first envelope, the expected
value in the second envelope would still be £10. I ran the simulation
and, surprisingly, I'm getting a very slight edge when I swap,
contrary to my intuition. I think something in my code might be wrong.
I have attached it below for whoever wants to play with it. I'd be
grateful for any feedback.

# Envelopes simulation:
#
# There are two envelopes, one has a certain amount of money `x', and the
# other an amount `r*x', where `r' is a positive constant (usually r=2 or
# r=0.5). You are allowed to choose one of the envelopes and open it. After
# you know the amount of money inside the envelope you are given two
# options: keep the money from the current envelope or switch envelopes and
# keep the money from the second envelope. What's the best strategy? To
# switch or not to switch?
#
# Naive explanation: imagine r=2, then you should switch since there is a
# 50% chance of the other envelope having 2x and 50% of it having x/2, then
# your expected earnings are E = 0.5*2x + 0.5*x/2 = 1.25x, since 1.25x > x
# you should switch! But, is this explanation right?
#
# August 2008, Mario dos Reis

# Function to generate the envelopes and their money
# r: constant, so that x is the amount of money in one envelope and r*x is
#    the amount of money in the second envelope
# rdist: a random distribution for the amount x
# n: number of envelope pairs to generate
# ...: additional parameters for the random distribution
# The function returns an n x 2 matrix containing the (randomized) pairs
# of envelopes
generateenv <- function (r, rdist, n, ...) {
  env <- matrix(0, ncol=2, nrow=n)
  env[,1] <- rdist(n, ...)  # first envelope has `x'
  env[,2] <- r*env[,1]      # second envelope has `r*x'

  # randomize the envelopes, so we don't know which one from
  # the pair has `x' or `r*x'
  i <- as.logical(rbinom(n, 1, 0.5))
  renv <- env
  renv[i,1] <- env[i,2]
  renv[i,2] <- env[i,1]

  return(renv)  # return the randomized envelopes
}

# example, `x' follows an exponential distribution with E(x) = 10
# we do one million simulations (n=1e6)
env <- generateenv(r=2, rexp, n=1e6, rate=1/10)
mean(env[,1]) # you keep the randomly assigned first envelope
mean(env[,2]) # you always switch and keep the second

# example, `x' follows a gamma distribution, r=0.5
env <- generateenv(r=.5, rgamma, n=1e6, shape=1, rate=1/20)
mean(env[,1]) # you keep the randomly assigned first envelope
mean(env[,2]) # you always switch and keep the second

# example, a positive 'normal' distribution
# First write your own function:
rposnorm <- function (n, ...)
{
  return(abs(rnorm(n, ...)))
}
env <- generateenv(r=2, rposnorm, n=1e6, mean=20, sd=10)
mean(env[,1]) # you keep the randomly assigned first envelope
mean(env[,2]) # you always switch and keep the second

# example, exponential approximated as an integer
# we use ceiling as we don't want zeroes
rintexp <- function(n, ...) return (ceiling(rexp(n, ...)))
env <- generateenv(r=2, rintexp, n=1e6, rate=1/10)
mean(env[,1]) # you keep the randomly assigned first envelope
mean(env[,2]) # you always switch and keep the second
i10 <- which(env[,1]==10)
mean(env[i10,1]) # Exactly 10
mean(env[i10,2]) # ~ 10.58 - 10.69 after several trials

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

