On Jul 19, 2009, at 3:11 PM, Hadassa Brunschwig wrote:

Hi

I am not sure what you mean by sampling an index of a group of
intervals. I will try to give an example:

If you had a dataframe of the following sort:

dfint
start stop
3       7
12      20
40      45
60      72

And you wanted to generate a set of 100 samples with equal probability of occurring in any one of those intervals, you might sample first on the index of the intervals:

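idx <- sample(1:4, 100, replace = TRUE)
### and then sample within the intervals, one draw per sampled index.
### (a:b with vector arguments would silently use only the first elements,
###  so each index is handled separately)
smp <- sapply(idx, function(i) {
    rng <- dfint$start[i]:dfint$stop[i]
    rng[sample.int(length(rng), 1)]
})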
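
The two-stage idea above (pick an interval index, then pick a point inside that interval) can also be written without a loop. This is only a sketch: the data.frame construction, the seed and the names len and smp are assumptions added for illustration, not something from the original posting.

```r
# Assumed reconstruction of the example data.frame shown above
dfint <- data.frame(start = c(3, 12, 40, 60),
                    stop  = c(7, 20, 45, 72))

set.seed(42)                                   # arbitrary seed, for reproducibility only

# Stage 1: choose an interval for each of the 100 samples, equal probability
idx <- sample(nrow(dfint), 100, replace = TRUE)

# Stage 2: choose an integer offset inside the chosen interval
len <- dfint$stop - dfint$start + 1            # number of integers in each interval
smp <- dfint$start[idx] + floor(runif(100) * len[idx])
```

Each interval is equally likely regardless of its length, so integers in short intervals are individually more likely than integers in long ones. Whether that is the intended probability model is exactly the open question.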


Let's assume I have a vector 1:1000000. Let's say I have 10 intervals
of different but known length, say,
c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample
those 10 intervals 1000 times.
The requirement is, however, that they should be of those lengths and
should not be overlapping.
In short, I would like to obtain a 10x1000 matrix with sampled intervals.

I am trying to understand how the vector 1:1000000 relates to the intervals. What do you mean by "sample the 10 intervals"? For one thing, what you have offered are not intervals at all, only lengths. Are you saying (in part at least) that you want a sampled element to be equally likely to come from each of the intervals, however they might be defined? Or do you want the sampling probabilities to vary from interval to interval?

I say again, ... a concrete example could do marvels in communicating your goals.
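
For what it is worth, here is one possible reading of the request as a concrete example: place 10 intervals of the stated lengths at random positions in 1:1000000 so that none overlap, and repeat that 1000 times. This is a sketch under assumptions, not a confirmed answer. The function name sample_intervals is hypothetical, the output is taken to be a matrix of start positions, and touching (adjacent) intervals are allowed.

```r
# Hypothetical helper: random non-overlapping placement of intervals
# with the given lengths inside 1:N. Returns one start position per length.
sample_intervals <- function(N, lens) {
  k    <- length(lens)
  free <- N - sum(lens)                 # integers not covered by any interval
  stopifnot(free >= 0)
  ord  <- sample.int(k)                 # random left-to-right order of the intervals
  # Shrink every interval to a single cell, then pick k of the free + k cells.
  pos  <- sort(sample.int(free + k, k))
  # Expand the cells back to full length to recover the real start positions.
  shift <- cumsum(c(0, lens[ord][-k] - 1))
  start <- integer(k)
  start[ord] <- pos + shift             # start[i] corresponds to lens[i]
  start
}

set.seed(1)                             # arbitrary seed, for reproducibility only
lens   <- c(4, 6, 11, 2, 8, 14, 7, 2, 18, 32)
starts <- replicate(1000, sample_intervals(1000000, lens))  # 10 x 1000 matrix
stops  <- starts + lens - 1             # lens is recycled down each column
```

Shrinking each interval to one cell makes non-overlap automatic, so there is no need to discard already chosen regions and resample. If a minimum gap between intervals is required, or if the intervals must keep a fixed order, the sketch would need to change.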

Do



Thanks
Hadassa

On Sun, Jul 19, 2009 at 9:48 PM, David Winsemius <dwinsem...@comcast.net> wrote:

On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote:

Hi,

I hope I am not repeating a question which has been posed already.
I am trying to do the following in the most efficient way:
I would like to sample from a finite (large) set of integers n
non-overlapping intervals, where each interval i has a different, set
length L_i (which is the number of integers in the interval).
I had the idea to sample recursively on a vector with the already
chosen intervals discarded but that seems to be too complicated.

It might be ridiculously easy if you sampled on an index of a group of
intervals.
Why not pose the question in the form of example data.frames or other
classes of R objects? Specification of the desired output would be
essential. I think further specification of the sampling strategy would
also help because I am unable to understand what sort of probability
model you are hoping to apply.

Any suggestions on that?

Thanks a lot.

Hadassa


--
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
