Thanks, Rui! Does anybody have ideas about doing the filling _while_ binding the data frames, rather than afterwards?
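
To make the question concrete, here is a rough base-R sketch of what filling while binding could look like. `bind_fill` is a made-up name for illustration and is not part of reshape or plyr; the fill defaults are simply the ones used further down in this thread. It assumes that a column present in both data frames has the same type in both, and it only fills columns that are absent from one input. NAs that are already in the data are left alone.

```r
# Hypothetical helper (not part of reshape/plyr): bind two data frames by row
# and fill the columns missing from either one with a type-specific value
# instead of NA.
bind_fill <- function(a, b, na_factor_level = "NOT_REALIZED", na_num = 99999) {
  all_cols <- union(names(a), names(b))

  # Add the columns that x lacks, already filled, using the other data
  # frame's column as the type template.
  pad <- function(x, template) {
    for (nm in setdiff(all_cols, names(x))) {
      x[[nm]] <- if (is.numeric(template[[nm]])) {
        rep(na_num, nrow(x))
      } else {
        factor(rep(na_factor_level, nrow(x)))
      }
    }
    x[all_cols]
  }

  # rbind() on data frames merges the factor levels of matching columns.
  rbind(pad(a, b), pad(b, a))
}

# Usage: 'a' lacks child_type_1 and n, so both are filled during the bind.
a <- data.frame(trg_type = factor("Scientists"))
b <- data.frame(trg_type = factor("of"), child_type_1 = factor("used"), n = 3)
res <- bind_fill(a, b)
str(res)
```

Padding each input before the single `rbind()` call means no NA is ever created for the missing columns, so no second pass over the combined data frame is needed.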
Ingmar

2012/8/22 Rui Barradas <ruipbarra...@sapo.pt>

> Hello,
>
> Your function doesn't seem to be very difficult to generalize.
>
> d <- read.table(text="
> trg_type child_type_1
> 1 Scientists NA
> 2 of used
> ", header=TRUE, stringsAsFactors=TRUE)
> str(d)
>
> subs_na <- function(tok, na_factor_level = "NOT_REALIZED", na_num = 99999)
> {
>     ifac <- which(sapply(tok, is.factor))
>     inum <- which(sapply(tok, is.numeric))
>     for(i in ifac) {
>         levels(tok[, i]) <- c(levels(tok[, i]), na_factor_level)
>         tok[is.na(tok[, i]), i] <- as.factor(na_factor_level)
>     }
>     for(i in inum)
>         tok[is.na(tok[, i]), i] <- na_num
>     tok
> }
>
> r1 <- substitute_na(d)
> r2 <- subs_na(d)
> str(r1)
> str(r2)
> identical(r1, r2) # TRUE
>
> You could use the same coding for characters, Dates, etc.
>
> Hope this helps,
>
> Rui Barradas
>
> On 22-08-2012 20:16, Ingmar Schuster wrote:
>
>> Hi,
>>
>> I have a data set with variables that are _not_ missing at random. Now I
>> use a package for learning a Bayesian Network which won't accept NA as a
>> value. From a database I query data.frames with k, k+n, k+2n, ... variables
>> (there are always at least k variables as leftmost columns). Using
>> rbind.fill from the reshape package on two data frames I would get a data
>> frame like
>>
>>   trg_type   child_type_1
>> 1 Scientists NA
>> 2 of         used
>>
>> Now to get rid of NA values I use the following function, which works for
>> data frames with only factor values:
>>
>> substitute_na <- function(tok, na_factor_level = "NOT_REALIZED") {
>>     for (i in 1:length(tok)) {
>>         levels(tok[,i]) <- c(levels(tok[,i]), na_factor_level)
>>     }
>>     tok[is.na(tok)] <- as.factor(na_factor_level)
>>     return(tok)
>> }
>>
>> Is there a better/faster way to do it? It would also be great to be able to
>> distinguish factor columns from numeric columns and use a special numeric
>> value there. The current version of rbind.fill makes no direct reference to
>> the fill value so that I could change its implementation for my purpose.
>>
>> Thanks!
>>
>> Ingmar

-- 
Ingmar Schuster
Natural Language Processing Group
Department of Computer Science
University of Leipzig
Johannisgasse 26
04103 Leipzig, Germany
Tel. +49 341 9732205
http://asv.informatik.uni-leipzig.de/en/staff/Ingmar_Schuster