On Sat, Feb 19, 2011 at 11:30 AM, Ted Harding <ted.hard...@wlandres.net> wrote: > On 19-Feb-11 14:48:53, David Winsemius wrote: >> On Feb 19, 2011, at 5:47 AM, danielepippo wrote: >>> Hi, >>> I have two vector with the marginal distribution like this: >>>> a >>> [1] -0.419 -0.364 -0.159 -0.046 -0.010 -0.002 0.000 0.000 0.000 >>>> b >>> [1] 0.125 0.260 0.270 0.187 0.097 0.041 0.014 0.004 0.001 >>> >>> How can I calculate the joint distribution with R? >> >> Marginal distributions do not uniquely determine the joint >> distribution, Furthermore, the first one doesn't even look like a >> distribution or even a density. Densities are positive and CDFs range >> from 0 to 1. (The second on could be a discrete density for some set >> of values.) So you need to explain where those numbers come from and >> why you think we should apply sort of "distributional" assumptions >> about them. >> -- >> David Winsemius, MD >> West Hartford, CT > > Following up on David's reply (I was in the midst of thinking about > this when he responded): > > I strongly suspect that the values of 'a' have had their signs > changed. If we simply make them all positive: > > a <- c(0.419,0.364,0.159,0.046,0.010,0.002,0.000,0.000,0.000) > > then now sum(a) = 1 and these can be probabilities. Also, with > the values given, sum(b) = 0.999; so (in the relatively least > disturbing way) I shall increase the largest value by 0.001: > > b <- c(0.125,0.260,0.271,0.187,0.097,0.041,0.014,0.004,0.001) > > (0.270 -> 0.271). Now also sum(b) = 1. > > Now, as a somewhat trivial example to illustrate the non-uniqueness > that David points out, I shall invent a very arbitrary set of > probabilities P = {P[i,j]} such that they have the given marginal > distributions. > > Since we have the marginals given to 3 decimal places, now consider > allocating 1000 pairs of random numbers (x,y). Then we pick out the > smallest 419 of the x's to go into the bottom class of 'a' > (proportion = 419/1000 = 0.419), then the 364 next smallest for the > second class of 'a', and so on. And similarly for the numbers 'y' > to go into the classes of 'b'. > > Then the inequalities which define these marginal divisions can be > applied jointly to produce the joint divisions. In the first instance > the results will be computed as counts > > Therefore: > > X0 <- runif(1000) ; X<- sort(X0) > Y0 <- runif(1000) ; Y<- sort(Y0) > > A <- cumsum(c(419,364,159, 46, 10, 2, 0, 0, 0)) > B <- cumsum(c(125,260,271,187, 97, 41, 14, 4, 1)) > > Xdiv <- numeric(8) > Ydiv <- numeric(8) > > for(i in (1:8)){ > Xdiv[i] <- 0.5*(max(X[1:A[i]])+min(X[(A[i]+1):A[i+1]])) > Ydiv[i] <- 0.5*(max(Y[1:B[i]])+min(Y[(B[i]+1):B[i+1]])) > Xdiv[is.na(Xdiv)] <- 1.0 > Ydiv[is.na(Ydiv)] <- 1.0 > } > Xdiv <-c(0,Xdiv,1) > Ydiv <- c(0,Ydiv,1) > > N <- matrix(0,nrow=9,ncol=9) > > for(i in (1:9)){ > for(j in (1:9)){ > N[i,j] <- sum((Xdiv[i]<X0)&(X0<=Xdiv[i+1])& > (Ydiv[j]<Y0)&(Y0<=Ydiv[j+1])) > } > } > > N > rowSums(N) > colSums(N) > > Here is one typical result (depending on the random numbers) for N: > > N > # [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] > # [1,] 54 113 112 76 36 20 7 1 0 > # [2,] 44 90 101 75 35 13 3 2 1 > # [3,] 20 40 42 28 21 6 2 0 0 > # [4,] 5 13 13 6 5 2 1 1 0 > # [5,] 2 3 3 1 0 0 1 0 0 > # [6,] 0 1 0 1 0 0 0 0 0 > # [7,] 0 0 0 0 0 0 0 0 0 > # [8,] 0 0 0 0 0 0 0 0 0 > # [9,] 0 0 0 0 0 0 0 0 0 > rowSums(N) > # [1] 419 364 159 46 10 2 0 0 0 ## Agreeing with (A) > colSums(N) > # [1] 125 260 271 187 97 41 14 4 1 ## Agreeing with (B) > > And here is another: > > # [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] > # [1,] 52 109 113 89 35 15 3 3 0 > # [2,] 45 95 108 50 40 16 9 0 1 > # [3,] 22 44 33 33 16 9 1 1 0 > # [4,] 4 9 14 13 4 1 1 0 0 > # [5,] 2 3 3 1 1 0 0 0 0 > # [6,] 0 0 0 1 1 0 0 0 0 > # [7,] 0 0 0 0 0 0 0 0 0 > # [8,] 0 0 0 0 0 0 0 0 0 > # [9,] 0 0 0 0 0 0 0 0 0 > rowSums(N) > # [1] 419 364 159 46 10 2 0 0 0 > colSums(N) > # [1] 125 260 271 187 97 41 14 4 1 > > Then N/1000 gives the joint probability distribution, and > rowSums(N) and colSums(N) give the marginal distributions.
Here is another way to demonstrate two different joint probability distributions with the same marginals using R: set.seed(123) a <- c(-0.419, -0.364, -0.159, -0.046, -0.01, -0.002, 0, 0, 0) b <- c(0.125, 0.26, 0.27, 0.187, 0.097, 0.041, 0.014, 0.004, 0.001) # tabs will be a list of two tables with same marginals tabs <- lapply(r2dtable(2, 1000 * a / sum(a), 1000 * b / sum(b)), "/", 1000); tabs # rowtotals are the same for both tables lapply(tabs, rowSums) # column totals are the same for both tables lapply(tabs, colSums) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.