What I would do:
# read in your sample data
mbr <- read.table( "clipboard", header = TRUE, stringsAsFactors = FALSE )
# create a vector with the codes you want to consider
code.list <- c("A","B","C","D","E")
# reduce the data accordingly
mbr <- mbr[ mbr$code %in% code.list, ]
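# get your model matrix using reshape: melt() takes 'code' as the id variable,
# and cast() then counts the occurrences of each code per member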
library( reshape )
model.matrix <- as.data.frame( cast( melt( mbr ), value ~ code ) )
# cosmetics: rename the first column and convert the counts to 0/1 indicators
colnames( model.matrix )[1] <- "Member"
cols <- 2 : ncol( model.matrix )
model.matrix[ cols ] <- ifelse( model.matrix[ cols ] > 0, 1, 0 )
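
If you would rather not depend on reshape, the same idea can be sketched in base R with table(). This is only a rough alternative, not a tested solution for your real file: the data frame is rebuilt by hand from the sample in your mail, and the names counts, indicator and result are my own.

```r
# sample data from the question, typed in by hand (assumption: columns 'member' and 'code')
mbr <- data.frame(
  member = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4),
  code   = c("A", "C", "F", "B", "E", "D", "A", "B", "D", "G", "A"),
  stringsAsFactors = FALSE
)
code.list <- c("A", "B", "C", "D", "E")

# cross-tabulate member against code; codes outside code.list become NA
# in the factor and are dropped from the counts by table()
counts <- unclass( table( mbr$member, factor( mbr$code, levels = code.list ) ) )

# turn the counts into 0/1 indicators (integer matrix, one row per member)
indicator <- ( counts > 0 ) + 0L

# assemble a data frame with the member id in the first column
result <- data.frame( Member = rownames( indicator ), indicator, row.names = NULL )
print( result )
```

Because the rows come from the levels of the member column, a member whose codes all fall outside code.list should still get a row of zeros here, whereas subsetting first drops that member entirely.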
On Thursday 06 March 2014 19:23:03 Mckinstry, Craig wrote:
>
> I have a medical insurance claims data file divided into blocks by member,
> with multiple lines per member. I am processing these into a one-line-per-member
> model matrix. Member block sizes vary from 1 to 50+. I am matching attributes in
> the claims data to columns in the model matrix and have been getting by with a
> for loop, but for large files it takes much too long. Is there a
> vectorized/apply-based method to do this more efficiently?
>
> input data:
>
> member code
> 1 A
> 1 C
> 1 F
> 2 B
> 2 E
> 3 D
> 3 A
> 3 B
> 3 D
> 4 G
> 4 A
>
> code.list <- c("A","B","C","D","E")
> for(i in 1:n.mbr){
> mbr.i <- dat[dat$Rmbr==mbr.list[i],] #EXTRACT BLOCK OF MEMBER CLAIMS
> matrix.mat[i,unique(match(mbr.i$code,code.list))] <- 1
> }
>
>
> output model.matrix:
>
> Member A B C D E
>      1 1 0 1 0 0
>      2 0 1 0 0 1
>      3 1 1 0 1 0
>      4 1 0 0 0 0
>
> Craig McKinstry
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.