sorry, i had assumed readWorksheetFromFile would give you back a data
frame.  all of the operations i recommended work on data.frame objects

at different points in the code, check if it's a data.frame or a matrix..

class( temp )

..you can check its current class at any point.


and if it's a matrix, you can convert it to a data frame with

temp <- as.data.frame( temp )







On Fri, May 3, 2013 at 2:00 AM, jpm miao <miao...@gmail.com> wrote:

> Hi Anthony,
>
>    Thank you very much. It works very well. However, after this line
>
> > temp <- sapply( temp , as.numeric )
>
>    the data becomes a series of numbers instead of a matrix. Is there any
> way to keep it a matrix?
>
>    Thanks,
>
> Miao
>
>
>
>
> > temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
> startRow=2, endRow= 11, startCol=2, endCol=5)
> > temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
> > temp
>       Col1     Col2   Col3    Col4
>  [1,] "647853" "1413" "57662" "27897"
>  [2,] "491400" "1365" "40919" "20411"
>  [3,] "38604"  "-"    "5505"  "985"
>  [4,] "576"    "-"    "20"    "54"
>  [5,] "80845"  "21"   "10211" "4494"
>  [6,] "36428"  "27"   "1007"  "1953"
>  [7,] "269915" "587"  "32988" "12779"
>  [8,] "224494" "-"    "30554" "9184"
>  [9,] "11858"  "587"  "-"     "686"
> [10,] "3742"   "-"    "81"    "415"
>  > temp <- sapply( temp , as.numeric )
> Warning messages:
> 1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> > temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365     NA     NA
>     21     27    587      -    587      -  57662
>     21     27    587     NA    587     NA  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>     NA     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
> > temp[ is.na( temp ) ] <- 0
> > temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365      0      0
>     21     27    587      -    587      -  57662
>     21     27    587      0    587      0  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>      0     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
>
>
> 2013/5/2 Anthony Damico <ajdam...@gmail.com>
>
>> try adding colTypes = 'numeric' to your readWorkSheetFromFile() call
>>
>>
>>
>> if that doesn't work, try a few other steps
>>
>>
>> # view what data types your file is being read in as
>> sapply( temp , class )
>>
>>
>> # convert all fields to character if they're factor variables.. but i
>> don't think you need this, readWorksheet defaults to `character`
>> temp <- sapply( temp , as.character )
>>
>>
>> # you can also convert a subset like this
>> temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
>>
>>
>>
>> # remove commas from character strings
>> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>>
>> # convert all fields to numeric
>> temp <- sapply( temp , as.numeric )
>>
>> # convert all NA fields to zeroes if you prefer
>> temp[ is.na( temp ) ] <- 0
>>
>>
>>
>>
>>
>> On Wed, May 1, 2013 at 11:55 PM, jpm miao <miao...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>>    Attached are two datasheet to be read.
>>>    My raw data "130502temp.xlsx" contains numbers with ' symbols, and
>>> they
>>> can't be read as numbers. Even if I copy and paste as numbers to form a
>>> new
>>> file "130502temp_number1.xlsx", they could not be read smoothly.
>>>
>>>    1. How can I read the datasheet as numbers?
>>>    2. How can I treat the notation "-" as (1) "NA" or (2) zero?
>>>
>>>    Thanks,
>>>
>>> Miao
>>>
>>>
>>>
>>>
>>> > temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
>>> startRow=2, endRow= 11, startCol=2, endCol=5)
>>>
>>> > temp
>>>
>>>       Col1  Col2   Col3   Col4
>>>
>>> 1  647,853 1,413 57,662 27,897
>>>
>>> 2  491,400 1,365 40,919 20,411
>>>
>>> 3   38,604     -  5,505    985
>>>
>>> 4      576     -     20     54
>>>
>>> 5   80,845    21 10,211  4,494
>>>
>>> 6   36,428    27  1,007  1,953
>>>
>>> 7  269,915   587 32,988 12,779
>>>
>>> 8  224,494     - 30,554  9,184
>>>
>>> 9   11,858   587      -    686
>>>
>>> 10   3,742     -     81    415
>>>
>>> > temp[2,2]
>>>
>>> [1] "1,365"
>>>
>>> > temp[2,2]+3
>>>
>>> Error in temp[2, 2] + 3 : non-numeric argument to binary operator
>>>
>>> > temp_num<-readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,
>>> header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
>>>
>>> > temp_num[2,2]
>>>
>>> [1] "1,365"
>>>
>>> > temp_num[2,2]+3
>>>
>>> Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
>>>
>>> > as.numeric(temp_num[2,2])+3
>>>
>>> [1] NA
>>>
>>> Warning message:
>>>
>>> NAs introduced by coercion
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to