Anthony28 wrote:
I need to use R to model a large number of experiments (say, 1000). Each
experiment involves the random selection of 5 numbers (without replacement)
from a pool of numbers ranging between 1 and 30.
What I need to know is what *proportion* of those experiments contains two
or more numbers that are consecutive. So, for instance, an experiment that
yielded the numbers 2, 28, 31, 4, 27 would be considered a "consecutive =
true" experiment since 28 and 27 are two consecutive numbers, even though
they are not side-by-side.
I am quite new to R, so really am puzzled as to how to go about this. I've
tried sorting each experiment, and then subtracting adjacent pairs of
numbers to see if the difference is plus or minus 1. I'm also unsure about
whether to use an array to store all the data first.
Any assistance would be much appreciated.
> Vec <- c(2, 28, 31, 4, 27)
> Vec
[1] 2 28 31 4 27
# Sort the vector
> sort(Vec)
[1] 2 4 27 28 31
# Get differences between sequential elements
> diff(sort(Vec))
[1] 2 23 1 3
# Are any differences == 1?
> any(diff(sort(Vec)) == 1)
[1] TRUE
See ?sort, ?diff and ?any for more information.
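As a small sketch, the check above can be wrapped in a helper function so
that it can be reused for each experiment. The name 'has_consecutive' is
only an illustrative choice, not something that exists in base R:

```r
# Hypothetical helper (the name is an assumption, not a base R function):
# returns TRUE if any two values in x differ by exactly 1 once sorted
has_consecutive <- function(x) {
  any(diff(sort(x)) == 1)
}

has_consecutive(c(2, 28, 31, 4, 27))  # TRUE: 27 and 28 are consecutive
has_consecutive(c(2, 10, 20, 4, 27))  # FALSE: no two values differ by 1
```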
On your last question: if the data are all numeric and each experiment
has a pool of 30 elements from which you select five, you can store the
source data in an N x 30 matrix, where N is the number of source data
sets. The results can then be stored in an N x 5 matrix, one row per
experiment.
You can then run your test of sequential members as follows, presuming
'Res' contains the N x 5 result matrix:
prop.table(table(apply(Res, 1, function(x) any(diff(sort(x)) == 1))))
The output gives the proportion of rows that do (TRUE) and do not (FALSE)
contain sequential elements.
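For the simulation in the original question, one possible sketch follows.
It assumes each experiment is a draw of 5 values from 1:30 using sample(),
which samples without replacement by default. 'N' and 'Res' follow the
naming above; 'consec' and the seed value are arbitrary choices made for
this example:

```r
set.seed(42)   # arbitrary seed, only so the run is reproducible
N <- 1000      # number of experiments

# Each row of Res is one experiment: 5 draws from 1:30 without replacement
Res <- t(replicate(N, sample(1:30, 5)))

# One TRUE/FALSE per row: does the experiment contain consecutive numbers?
consec <- apply(Res, 1, function(x) any(diff(sort(x)) == 1))

# Proportion of experiments with at least one consecutive pair
mean(consec)

# Exact probability for comparison: the number of 5-element subsets of 1:30
# with no two consecutive members is choose(26, 5)
1 - choose(26, 5) / choose(30, 5)
```

Taking mean() of a logical vector gives the proportion of TRUE values
directly, so it reports the same TRUE proportion as the prop.table() call.
The exact value is about 0.538, and the simulated proportion should land
close to it, with the usual sampling variation for N = 1000.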
HTH,
Marc Schwartz
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help