Actually, scratch that, sorry!
I wrapped the second part of your second solution in a function, and it now
returns the right data frame. So:
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  return(data.frame(unit_id = unit$id, pid = pid, senior = senior))
}
world <- function(n_units, unit_size) {
  units <- data.frame(id = 1:n_units, size = unit_size)
  library(plyr)
  a <- ddply(units, .(id), generate_unit)
  return(a)
}
and calling
world(n_units = 2, unit_size = 5)
gives me
   id unit_id pid senior
1   1       1   1      1
2   1       1   2      0
3   1       1   3      1
4   1       1   4      0
5   1       1   5      0
6   2       2   1      1
7   2       2   2      0
8   2       2   3      1
9   2       2   4      0
10  2       2   5      0
Which is perfect! Sorry for jumping the gun and thanks again!
-Emma
----- Original Message -----
From: Emma Thomas <[email protected]>
To: Jan van der Laan <[email protected]>; "[email protected]"
<[email protected]>
Cc:
Sent: Wednesday, December 14, 2011 12:23 PM
Subject: Re: [R] Generating input population for microsimulation
Dear Jan,
Thanks for your reply.
The first solution works well for my needs for now, but I have a question about
the second. If I run your code and then call the function:
generate_unit(10)
I get the following error:
Error in unit$size : $ operator is invalid for atomic vectors
Did you experience the same thing?
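The error message is consistent with generate_unit() being written to receive a one-row data frame with id and size columns (which is what ddply passes for each unit), not a bare number. A minimal sketch of a direct call under that reading; the one-row data frame and its values are made up for illustration:

```r
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1   # two randomly chosen seniors
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

# generate_unit(10) fails: 10 is an atomic vector, so unit$size is invalid.
# A one-row data frame (hypothetical values) has the columns the function needs.
one_unit <- data.frame(id = 1, size = 10)
res <- generate_unit(one_unit)
```

Here res has one row per person in the unit, with exactly two rows flagged as senior.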
In any case, I will definitely take a look at the plyr package, which I'm sure
will be useful in the future.
Thanks again!
Emma
----- Original Message -----
From: Jan van der Laan <[email protected]>
To: "[email protected]" <[email protected]>
Cc: Emma Thomas <[email protected]>
Sent: Wednesday, December 14, 2011 6:18 AM
Subject: Re: [R] Generating input population for microsimulation
Emma,
If, as you say, each unit is the same you can just repeat the units to obtain
the required number of units. For example,
unit_size <- 10
n_units <- 10
unit_id <- rep(1:n_units, each=unit_size)
pid <- rep(1:unit_size, n_units)
senior <- ifelse(pid <= 2, 1, 0)
pop <- data.frame(unit_id, pid, senior)
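In the data frame above, pid restarts at 1 within every unit. If an identifier that is unique across the whole population is also wanted, one possible sketch is to add a running number; the column name person_id is an assumption, not part of the original code:

```r
unit_size <- 10
n_units <- 10
unit_id <- rep(1:n_units, each = unit_size)
pid <- rep(1:unit_size, n_units)
senior <- ifelse(pid <= 2, 1, 0)
pop <- data.frame(unit_id, pid, senior)

# Hypothetical extra column: a running ID that is unique across all units
pop$person_id <- seq_len(nrow(pop))
```

The pair (unit_id, pid) already identifies each person, so the extra column is only a convenience.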
If you want more flexibility in generating the units, I would first generate
the units (without the persons) and then generate the persons for each unit. In
the example below I use the plyr package; you could probably also use
lapply/sapply, or simply a loop over the units.
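library(plyr)
# Generate the persons for one unit; 'unit' is a one-row data frame
# with columns 'id' and 'size'.
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1  # pick two seniors at random
  return(data.frame(unit_id=unit$id, pid=pid, senior=senior))
}
units <- data.frame(id=1:n_units, size=unit_size)
# Apply generate_unit to each unit and stack the results
ddply(units, .(id), generate_unit)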
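The lapply alternative mentioned above could look roughly like the following base R sketch. It assumes the same generate_unit() as above; the values of n_units and unit_size and the name unit_list are chosen only for illustration:

```r
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

n_units <- 3      # illustrative values
unit_size <- 10
units <- data.frame(id = 1:n_units, size = unit_size)

# Split the units into a list of one-row data frames, generate the
# persons for each, and stack the pieces into one data frame.
unit_list <- split(units, units$id)
pop <- do.call(rbind, lapply(unit_list, generate_unit))
rownames(pop) <- NULL
```

This avoids the plyr dependency, at the cost of doing the split and the rbind by hand.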
HTH,
Jan
Emma Thomas <[email protected]> wrote:
> Hi all,
>
> I've been struggling with some code and was wondering if you all could help.
>
> I am trying to generate a theoretical population of P people who are housed
> within X different units. Each unit follows the same structure: 10 people per
> unit, 8 of whom are junior and two of whom are senior. I'd like to create a
> unit ID and a unique identifier for each person (person ID, PID) in the
> population so that I have a matrix that looks like:
>
> unit_id pid senior
> [1,] 1 1 0
> [2,] 1 2 0
> [3,] 1 3 0
> [4,] 1 4 0
> [5,] 1 5 0
> [6,] 1 6 0
> [7,] 1 7 0
> [8,] 1 8 0
> [9,] 1 9 1
> [10,] 1 10 1
> ...
>
> I came up with the following code, but am having some trouble getting it to
> populate my matrix the way I'd like.
>
> world <- function(units, pop_size, unit_size){
> pid <- rep(0,pop_size) #person ID
> senior <- rep(0,pop_size) #senior in charge
> unit_id <- rep(0,pop_size) #unit ID
>
> for (i in 1:pop_size){
> for (f in 1:units) {
> senior[i] = sample(c(1,1,0,0,0,0,0,0,0,0), 1, replace = FALSE)
> pid[i] = sample(c(1:10), 1, replace = FALSE)
> unit_id[i] <- f
> }}
> data <- cbind(unit_id, pid, senior)
>
> return(data)
> }
>
> world(units = 10,pop_size = 100, unit_size = 10) #call the function
>
>
>
> The output looks like:
> unit_id pid senior
> [1,] 10 7 0
> [2,] 10 4 0
> [3,] 10 10 0
> [4,] 10 9 1
> [5,] 10 10 0
> [6,] 10 1 1
> ...
>
> but what I really want is to generate 10 different units with two seniors
> per unit, and with each person in the population having a unique identifier.
>
> I thought a nested for loop was one way to go about creating my data set of
> people and families, but obviously I'm doing something (or many things)
> wrong. Any suggestions on how to fix this? I had been focusing on creating a
> person and assigning them to a unit, but perhaps I should create the units
> and then populate the units with people?
>
> Thanks so much in advance.
>
> Emma
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.