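N <- 8 # however many times you want to do this
ans <- lapply( seq.int( N )
             , function( n ) {
                 # shuffle the rows, then keep whole groups until the
                 # running total of count first reaches 40 (40 exactly counts)
                 idx <- sample( nrow( mydat ) )
                 mydat[ idx[ seq.int( which( 40 <= cumsum( mydat[ idx, "count" ] ) )[ 1 ] ) ], ]
               }
             )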
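
If it helps, the same idea can be wrapped in a function and checked. This is only a sketch: pick_groups(), its target argument, and the name draws are made up for illustration, and set.seed() is there only so the draws are repeatable. It assumes a whole group is either in or out, and that selection stops at the first group that brings the running total to at least 40.

```r
# Same data as in the original question.
mydat <- read.table( text = 'group count
G1 25
G2 15
G3 12
G4 31
G5 10', header = TRUE, as.is = TRUE )

# pick_groups() is a hypothetical helper name, not something from a package.
pick_groups <- function( dat, target = 40 ) {
  stopifnot( sum( dat$count ) >= target )  # otherwise the target is unreachable
  idx <- sample( nrow( dat ) )             # random order, without replacement
  # position of the first group at which the running total reaches the target
  n_keep <- which( cumsum( dat$count[ idx ] ) >= target )[ 1 ]
  dat[ idx[ seq_len( n_keep ) ], ]
}

set.seed( 42 )  # assumption: fixed seed, only for repeatability
draws <- lapply( seq_len( 8 ), function( n ) pick_groups( mydat ) )

# total count in each draw, and the total without the last group picked
totals <- sapply( draws, function( d ) sum( d$count ) )
before_last <- sapply( draws, function( d ) sum( d$count ) - d$count[ nrow( d ) ] )
```

Every entry of totals should be at least 40, and every entry of before_last should be below 40, which confirms that selection stopped as soon as the minimum was reached.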
On Mon, 11 Feb 2019, Val wrote:
Sorry Jeff and David for not being clear!
The total sample size should be at least 40, but the selection should
be based on group ID. Different combinations of group IDs could give
at least 40.
If I select group G1 with a count of 25 and G2 with a count of 15,
then I get the minimum of 40 counts. So G1 and G2 are
selected.
G1 25
G2 15
In another scenario, if G2, G3 and G4 are selected then the total
count will be 58 which is greater than 40. So G2 , G3 and G4 could
be selected.
G2 15
G3 12
G4 31
So the restriction is to find group IDs that give a minimum of 40.
Once I have reached a minimum of 40, stop selecting groups and output
the data.
I hope this helps
On Mon, Feb 11, 2019 at 5:09 PM Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
This constraint was not clear in your original sample data set. Can you expand
the data set to clarify how this requirement REALLY works?
On February 11, 2019 3:00:15 PM PST, Val <valkr...@gmail.com> wrote:
Thank you David.
However, this will not work for me. If a group ID is selected, then all
of its observations should be included.
On Mon, Feb 11, 2019 at 4:51 PM David L Carlson <dcarl...@tamu.edu>
wrote:
First expand your data frame into a vector where G1 is repeated 25
times, G2 is repeated 15 times, etc. Then draw random samples of 40
from that vector:
grp <- rep(mydat$group, mydat$count)
grp.sam <- sample(grp, 40)
table(grp.sam)
grp.sam
G1 G2 G3 G4 G5
10 9 5 13 3
----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352
-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Val
Sent: Monday, February 11, 2019 4:36 PM
To: r-help@R-project.org (r-help@r-project.org)
<r-help@r-project.org>
Subject: [R] Select
Hi all,
I have a data frame with two variables, group and its size.
mydat<- read.table( text='group count
G1 25
G2 15
G3 12
G4 31
G5 10' , header = TRUE, as.is = TRUE )
I want to select group IDs randomly (without replacement) until the
sum of count reaches 40.
So, in the first case, the data frame could be
G4 31
G5 10
In other case, it could be
G5 10
G2 15
G3 12
How do I impose the restriction that the sum of the count variable is
at least 40?
Thank you in advance
I want to select group ids randomly until I reach the
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Sent from my phone. Please excuse my brevity.
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnew...@dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k