On Nov 13, 2009, at 2:32 PM, frenchcr wrote:
Hello folks,
I'm trying to clean out a large file with data I don't need.
The column I'm manipulating in the file is called "legal_status".
There are three kinds of rows I want to remove:
those that have "Private", "Private (Op", or "Unknown" in the
legal_status column.
I wrote this code but it says I'm missing a TRUE/FALSE thingy... I'm
lost... here's the code...
Come on, "frenchcr". Just copy and post the damned error message.
cleanse <- function(a){
data1<-a
for (i in 1:dim(data1)[1])
{
if (data1[i,"legal_status"] == "Private")
{
data1[i,"legal_status"]<-data1[-i,"legal_status"]
That will return everything but one particular row.
}
if (data1[i,"legal_status"] == "Private (Op"){
data1[i,"legal_status"]<-data1[-i,"legal_status"]
Ditto.
}
if (data1[i,"legal_status"] == "Unknown"){
data1[i,"legal_status"]<-data1[-i,"legal_status"]
}
}
That makes for a lot of data.frame copying, even if you hadn't thrown
off the alignment of the row index with the shrinking data frame.
return(data1)
}
new_data<-cleanse(data)
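Since the actual message was not posted, this is only a guess: a "TRUE/FALSE" complaint from if() usually means the condition evaluated to NA, which happens when the value being compared is missing. A minimal sketch, using a made-up vector rather than the real file:

```r
# Hypothetical toy column with a missing value (not the poster's data)
status <- c("Private", NA, "Public")

# NA == "Private" is NA, and if() cannot branch on NA, so it throws an error.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  if (status[2] == "Private") "match" else "no match",
  error = function(e) conditionMessage(e)
)
print(res)
```

If that is what is happening, checking is.na() on the column before comparing would confirm it.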
new_data <- subset(data, legal_status != "Private" & legal_status !=
"Private (Op" & legal_status != "Unknown")
Or maybe:
"%not-in%" <- function(x, table) match(x, table, nomatch = 0) == 0
new_data <- subset(data, legal_status %not-in% c( "Private" ,
"Private (Op" , "Unknown") )
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT