Hi Steve,

Here is a suggestion using your original df1:

# Create a copy -- you can avoid this
newdf1 <- df1
# Process
# apply() first coerces the factor columns to a character matrix, so
# as.numeric() sees the printed values; "." becomes NA (with a warning)
newdf1[,2:4] <- apply(newdf1[,2:4], 2, function(x) as.numeric(x))

# Removing df1
rm(df1)

# Result
newdf1

# str()
str(newdf1)
# 'data.frame': 18 obs. of 4 variables:
# $ site: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 2 2 2 2 ...
# $ v1 : num 10 22 44 521 5 ...
# $ v2 : num 5 54 214 14 73 0.4 1 4 NA 4 ...
# $ v3 : num NA NA 2 4 1 4 NA 5 4 1 ...

HTH,
Jorge

On Wed, Sep 16, 2009 at 1:50 PM, Steve Hong wrote:

> Dear all,
>
> I have a partial data set with four columns. The first column is "site",
> with three factors (i.e., A, B, and C). The second to fourth columns
> (v1 ~ v3) are my observations. In the observations of the data set, "."
> indicates a missing value. To replace "." with NA, I used two steps.
> First, I replaced "." with NA, and then changed each variable from factor
> to numeric using "df1$v1 <- as.numeric(df1$v1)". The second step was OK
> when I had a small number of variables; however, it is painful when I
> have a lot of variables.
>
> My question is: Is there a much more efficient way to convert this kind
> of large-scale data? In short, I am looking for an alternative way of
> doing STEP 2, or the whole procedure if there is one.
>
> Any comment will be highly appreciated.
>
> Thank you in advance!!
>
> Steve
>
> P.S.: Below is an example of what I did.
>
> STEP 1
>
> > df1
>    site   v1  v2 v3
> 1     A   10   5  .
> 2     A   22  54  .
> 3     A   44 214  2
> 4     A  521  14  4
> 5     A    5  73  1
> 6     A 1654 0.4  4
> 7     B   16   1  .
> 8     B    .   4  5
> 9     B    .   .  4
> 10    B    .   4  1
> 11    B   51   .  2
> 12    B    5   .  .
> 13    C    1 0.4  .
> 14    C    0   4  .
> 15    C    1   1  4
> 16    C   40   .  7
> 17    C    4   .  7
> 18    C   10   .  1
> > str(df1)
> 'data.frame': 18 obs. of 4 variables:
>  $ site: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 2 2 2 2 ...
>  $ v1  : Factor w/ 13 levels ".","0","1","10",..: 4 7 10 13 11 6 5 1 1 1 ...
>  $ v2  : Factor w/ 9 levels ".","0.4","1",..: 7 8 5 4 9 2 3 6 1 6 ...
>  $ v3  : Factor w/ 6 levels ".","1","2","4",..: 1 1 3 4 2 4 1 5 4 2 ...
>
> > df1[df1=="."] <- "NA"
> Warning messages:
> 1: In `[<-.factor`(`*tmp*`, thisvar, value = "NA") :
>   invalid factor level, NAs generated
> 2: In `[<-.factor`(`*tmp*`, thisvar, value = "NA") :
>   invalid factor level, NAs generated
> 3: In `[<-.factor`(`*tmp*`, thisvar, value = "NA") :
>   invalid factor level, NAs generated
> > df1
>    site   v1   v2   v3
> 1     A   10    5 <NA>
> 2     A   22   54 <NA>
> 3     A   44  214    2
> 4     A  521   14    4
> 5     A    5   73    1
> 6     A 1654  0.4    4
> 7     B   16    1 <NA>
> 8     B <NA>    4    5
> 9     B <NA> <NA>    4
> 10    B <NA>    4    1
> 11    B   51 <NA>    2
> 12    B    5 <NA> <NA>
> 13    C    1  0.4 <NA>
> 14    C    0    4 <NA>
> 15    C    1    1    4
> 16    C   40 <NA>    7
> 17    C    4 <NA>    7
> 18    C   10 <NA>    1
> > str(df1)
> 'data.frame': 18 obs. of 4 variables:
>  $ site: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 2 2 2 2 ...
>  $ v1  : Factor w/ 13 levels ".","0","1","10",..: 4 7 10 13 11 6 5 NA NA NA ...
>  $ v2  : Factor w/ 9 levels ".","0.4","1",..: 7 8 5 4 9 2 3 6 NA 6 ...
>  $ v3  : Factor w/ 6 levels ".","1","2","4",..: NA NA 3 4 2 4 NA 5 4 2 ...
>
> STEP 2.
>
> > df1$v1 <- as.numeric(df1$v1)
> > df1$v2 <- as.numeric(df1$v2)
> > df1$v3 <- as.numeric(df1$v3)
> > df1
>    site v1 v2 v3
> 1     A  4  7 NA
> 2     A  7  8 NA
> 3     A 10  5  3
> 4     A 13  4  4
> 5     A 11  9  2
> 6     A  6  2  4
> 7     B  5  3 NA
> 8     B NA  6  5
> 9     B NA NA  4
> 10    B NA  6  2
> 11    B 12 NA  3
> 12    B 11 NA NA
> 13    C  3  2 NA
> 14    C  2  6 NA
> 15    C  3  3  4
> 16    C  9 NA  6
> 17    C  8 NA  6
> 18    C  4 NA  2
> > str(df1)
> 'data.frame': 18 obs. of 4 variables:
>  $ site: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 2 2 2 2 ...
>  $ v1  : num 4 7 10 13 11 6 5 NA NA NA ...
>  $ v2  : num 7 8 5 4 9 2 3 6 NA 6 ...
>  $ v3  : num NA NA 3 4 2 4 NA 5 4 2 ...
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
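
A note on STEP 2 in the quoted message: as.numeric() applied directly to a factor returns the internal level codes rather than the printed values, which is why v1 came out as 4 7 10 13 ... instead of 10 22 44 521 .... Below is a minimal sketch of a column-wise conversion that keeps the values by going through as.character() first. The data frame `toy` and the variables `codes` and `cols` are made-up names for illustration, not objects from this thread.

```r
# Toy data in the same shape as df1 (hypothetical values; "." marks missing)
toy <- data.frame(
  site = factor(c("A", "A", "B", "B")),
  v1   = factor(c("10", "22", ".", "521")),
  v2   = factor(c("5", ".", "0.4", "14"))
)

# Pitfall: as.numeric() on a factor gives the level codes, not the values
codes <- as.numeric(toy$v1)

# Convert every column except 'site' via character so the values survive.
# "." cannot be parsed as a number and becomes NA; suppressWarnings() only
# hides the "NAs introduced by coercion" warning that this triggers.
cols <- setdiff(names(toy), "site")
toy[cols] <- lapply(toy[cols], function(x) {
  suppressWarnings(as.numeric(as.character(x)))
})

str(toy)
```

This handles STEP 1 and STEP 2 in one pass, however many observation columns there are. If the data come from a text file, passing na.strings = "." to read.table() should mark "." as missing at import, so the columns would arrive as numeric and no conversion would be needed.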