On May 2, 2013, at 11:00 PM, jpm miao wrote:

> Hi Anthony,
>
> Thank you very much. It works very well. However, after this line
>
>> temp <- sapply( temp , as.numeric )
>
> the data becomes a series of numbers instead of a matrix. Is there any
> way to keep it a matrix?

Perhaps (assuming this were a data.frame to be coerced):

temp <- matrix( sapply( temp , as.numeric ), dim(temp)[1])

But the persistence of the "-"'s is puzzling. You should (as always) have posted the output from dput(temp).

> Thanks,
>
> Miao
>
>
>> temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
> startRow=2, endRow= 11, startCol=2, endCol=5)
>> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>> temp
>       Col1     Col2   Col3    Col4
>  [1,] "647853" "1413" "57662" "27897"
>  [2,] "491400" "1365" "40919" "20411"
>  [3,] "38604"  "-"    "5505"  "985"
>  [4,] "576"    "-"    "20"    "54"
>  [5,] "80845"  "21"   "10211" "4494"
>  [6,] "36428"  "27"   "1007"  "1953"
>  [7,] "269915" "587"  "32988" "12779"
>  [8,] "224494" "-"    "30554" "9184"
>  [9,] "11858"  "587"  "-"     "686"
> [10,] "3742"   "-"    "81"    "415"
>> temp <- sapply( temp , as.numeric )
> Warning messages:
> 1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
>> temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365     NA     NA
>     21     27    587      -    587      -  57662
>     21     27    587     NA    587     NA  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>     NA     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
>> temp[ is.na( temp ) ] <- 0
>> temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365      0      0
>     21     27    587      -    587      -  57662
>     21     27    587      0    587      0  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>      0     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
>
>
> 2013/5/2 Anthony Damico <ajdam...@gmail.com>
>
>> try adding colTypes = 'numeric' to your readWorksheetFromFile() call
>>
>> if that doesn't work, try a few other steps
>>
>> # view what data types your file is being read in as
>> sapply( temp , class )
>>
>> # convert all fields to character if they're factor variables.. but i
>> # don't think you need this, readWorksheet defaults to `character`
>> temp <- sapply( temp , as.character )
>>
>> # you can also convert a subset like this
>> temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
>>
>> # remove commas from character strings
>> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>>
>> # convert all fields to numeric
>> temp <- sapply( temp , as.numeric )
>>
>> # convert all NA fields to zeroes if you prefer
>> temp[ is.na( temp ) ] <- 0
>>
>>
>> On Wed, May 1, 2013 at 11:55 PM, jpm miao <miao...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Attached are two datasheets to be read.
>>> My raw data "130502temp.xlsx" contains numbers with ' symbols, and they
>>> can't be read as numbers. Even if I copy and paste as numbers to form a
>>> new file "130502temp_number1.xlsx", they could not be read smoothly.
>>>
>>> 1. How can I read the datasheet as numbers?
>>> 2. How can I treat the notation "-" as (1) "NA" or (2) zero?
>>>
>>> Thanks,
>>>
>>> Miao
>>>
>>>> temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
>>> startRow=2, endRow= 11, startCol=2, endCol=5)
>>>> temp
>>>       Col1  Col2   Col3   Col4
>>> 1  647,853 1,413 57,662 27,897
>>> 2  491,400 1,365 40,919 20,411
>>> 3   38,604     -  5,505    985
>>> 4      576     -     20     54
>>> 5   80,845    21 10,211  4,494
>>> 6   36,428    27  1,007  1,953
>>> 7  269,915   587 32,988 12,779
>>> 8  224,494     - 30,554  9,184
>>> 9   11,858   587      -    686
>>> 10   3,742     -     81    415
>>>> temp[2,2]
>>> [1] "1,365"
>>>> temp[2,2]+3
>>> Error in temp[2, 2] + 3 : non-numeric argument to binary operator
>>>> temp_num<-readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,
>>> header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
>>>> temp_num[2,2]
>>> [1] "1,365"
>>>> temp_num[2,2]+3
>>> Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
>>>> as.numeric(temp_num[2,2])+3
>>> [1] NA
>>> Warning message:
>>> NAs introduced by coercion
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
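
For anyone finding this thread later, here is a minimal sketch of one way to keep the rectangular shape. It uses made-up data in place of the worksheet; the data.frame contents and the helper name to_number are illustrative assumptions, not something from the original posts. As background, sapply() applied to a character matrix visits every cell and returns a named vector whose names are the original strings, which would explain why "-" still appears in the printout above each NA. Converting column by column, and only then turning the result into a matrix, should avoid both effects:

```r
# Hypothetical stand-in for the worksheet: character columns with
# thousands separators and "-" marking missing values.
temp <- data.frame(
  Col1 = c("647,853", "491,400", "38,604"),
  Col2 = c("1,413", "1,365", "-"),
  Col3 = c("57,662", "-", "5,505"),
  stringsAsFactors = FALSE
)

# Illustrative helper: clean one character column and convert it.
to_number <- function(x) {
  x <- gsub(",", "", x, fixed = TRUE)  # drop thousands separators
  x[x %in% "-"] <- NA                  # mark "-" as missing before coercion
  as.numeric(x)
}

# lapply() keeps one result per column, so the shape survives;
# as.matrix() then gives a numeric matrix with the column names intact.
temp_num <- as.matrix(data.frame(lapply(temp, to_number)))

# Optional second step: treat missing values as zero instead of NA.
temp_zero <- temp_num
temp_zero[is.na(temp_zero)] <- 0
```

Setting the "-" cells to NA before calling as.numeric() also avoids the "NAs introduced by coercion" warnings, so any warning that does appear points at a cell that is malformed in some other way.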