Re: [R] Problem with rowMeans()

ss Thu, 12 Jun 2008 17:15:47 -0700

Dear Erik and Wacek,

Please stop working on my problem. I deleted the second column and the
problem is gone. I don't know why, but apparently the second column somehow
interfered with the third column, so that the third column was read as
'factor' rather than 'numeric'.


I can recover the second column, which holds the gene symbols, later, so I
will not worry about it for now. I just don't want you to spend your precious
time on this.

Thanks much,
    Allen
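
Note for readers with the same symptom: a column that read.table() returns as
'factor' usually contains at least one entry that does not parse as a number.
A minimal sketch of how one might locate such entries and convert the column
is given here. The toy vector x and the stray "n/a" entry are invented for
illustration and are not taken from the original file.

```r
# Toy column standing in for data[[3]]; the "n/a" entry is hypothetical.
x <- factor(c("-1.6", "0.18", "n/a", "1.63", NA))

# as.numeric(x) alone would return the internal level codes, not the values,
# so convert to character first. Unparseable entries become NA (with a
# warning, suppressed here).
vals <- suppressWarnings(as.numeric(as.character(x)))

# Entries that were present in the input but failed to parse as numbers
bad <- !is.na(x) & is.na(vals)
which(bad)            # positions of the offending entries
as.character(x)[bad]  # their original text
```

On the real data one would replace x with data[[3]]; whatever text shows up in
the last line is what kept the column from being numeric.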

On Thu, Jun 12, 2008 at 8:01 PM, ss <[EMAIL PROTECTED]> wrote:

> Thanks, Erik. I will try your code soon.
>
> I did this first:
>
> > data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
> + row.names = NULL, header=TRUE, fill=TRUE)
> > class(data[[3]])
> [1] "factor"
> > is.numeric(data[[3]])
> [1] FALSE
> >
>
> So it is not numeric but 'factor' instead.
> Can I convert this column to numeric?
>
> Allen
>
>
> On Thu, Jun 12, 2008 at 7:48 PM, Erik Iverson <[EMAIL PROTECTED]>
> wrote:
>
>>
>>
>> ss wrote:
>>
>>> It is:
>>>
>>>  > data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
>>> + row.names = NULL, header=TRUE, fill=TRUE)
>>>  > class(data[3])
>>> [1] "data.frame"
>>>  >
>>>
>>>
>> Oops, should have said  class(data[[3]]) and
>> is.numeric(data[[3]])
>>
>> See ?Extract
>>
>>
>>> And if I try to use as.matrix(read.table()), I get:
>>>
>>>  > data <- as.matrix(read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
>>> + row.names = NULL, header=TRUE, fill=TRUE))
>>>  > data[1:4,1:4]
>>>      Probe_ID       Gene_Symbol M16012391010920 M16012391010525
>>> [1,] "A_23_P105862" "13CDNA73"  "-1.6"          " 0.16"
>>> [2,] "A_23_P76435"  "15E1.2"    "0.18"          " 0.59"
>>> [3,] "A_24_P402115" "15E1.2"    "1.63"          "-0.62"
>>> [4,] "A_32_P227764" "15E1.2"    "-0.76"         "-0.42"
>>> You see they are surrounded by "".
>>>
>>> I don't see the quotes if I just use read.table():
>>>
>>>
>> That is because matrices (objects of class 'matrix') are of homogeneous
>> type.  as.matrix() converts everything to character (including the
>> numbers), which you certainly do NOT want.
>>
>> You want a data.frame. I will provide an example of what I think you are
>> after.
>>
>> Try the following commands and see how they compare to your situation;
>> these work for me.
>>
>> test <- data.frame(x = factor(rep(c("A", "B"), each = 13)), y = rnorm(26),
>> z = rnorm(26))
>>
>> test
>>
>> class(test)
>>
>> is.numeric(test[[2]])
>>
>> is.numeric(test[[3]])
>>
>> rowMeans(test)       # error: 'x' must be numeric (column x is a factor)
>>
>> rowMeans(test[2:3])  # means over the two numeric columns only
>>
>>>  > data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
>>> + row.names = NULL, header=TRUE, fill=TRUE)
>>>  > data[1:4,1:4]
>>>      Probe_ID Gene_Symbol M16012391010920 M16012391010525
>>> 1 A_23_P105862    13CDNA73            -1.6            0.16
>>> 2  A_23_P76435      15E1.2            0.18            0.59
>>> 3 A_24_P402115      15E1.2            1.63           -0.62
>>> 4 A_32_P227764      15E1.2           -0.76           -0.42
>>>
>>>
>>> Thanks,
>>>      Allen
>>>
>>>
>>>
>>> On Thu, Jun 12, 2008 at 7:34 PM, Erik Iverson <[EMAIL PROTECTED]> wrote:
>>>
>>>
>>>
>>>    ss wrote:
>>>
>>>        Hi Wacek,
>>>
>>>        Yes, data is a data frame, not a matrix.
>>>
>>>            is.numeric(data[3])
>>>
>>>        [1] FALSE
>>>
>>>
>>>    What is class(data[3])?
>>>
>>>
>>>        But I looked at column 3 and it looks okay. There are a few NAs,
>>>        and I did not find anything strange.
>>>
>>>        Any suggestions?
>>>
>>>        Thanks,
>>>             Allen
>>>
>>>
>>>
>>>        On Thu, Jun 12, 2008 at 7:01 PM, Wacek Kusnierczyk
>>>        <[EMAIL PROTECTED]> wrote:
>>>
>>>            ss wrote:
>>>
>>>                Thank you very much, Wacek! It works very well.
>>>                But there is a minor problem. I did the following:
>>>
>>>                    data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
>>>                    + row.names = NULL, header=TRUE, fill=TRUE)
>>>
>>>            looks like you have a data frame, not a matrix
>>>
>>>
>>>                    dim(data)
>>>
>>>                [1] 23963    85
>>>
>>>                    data[1:4,1:4]
>>>
>>>                     Probe_ID Gene_Symbol M16012391010920 M16012391010525
>>>                1 A_23_P105862    13CDNA73            -1.6            0.16
>>>                2  A_23_P76435      15E1.2            0.18            0.59
>>>                3 A_24_P402115      15E1.2            1.63           -0.62
>>>                4 A_32_P227764      15E1.2           -0.76           -0.42
>>>
>>>                    data1<-data[sapply(data, is.numeric)]
>>>                    dim(data1)
>>>
>>>                [1] 23963    82
>>>
>>>                    data1[1:4,1:4]
>>>
>>>                  M16012391010525 M16012391010843 M16012391010531 M16012391010921
>>>                1            0.16           -0.23           -1.40            0.90
>>>                2            0.59            0.28           -0.30            0.08
>>>                3           -0.62           -0.62           -0.22           -0.18
>>>                4           -0.42            0.01            0.28           -0.79
>>>
>>>                You will notice that, after using
>>>                'data[sapply(data, is.numeric)]' to get data1, the first
>>>                sample in data, called 'M16012391010920', is missing from
>>>                data1.
>>>
>>>                Any further suggestions?
>>>
>>>            surely there must be an entry in column 3 that makes it
>>>            non-numeric.  what does is.numeric(data[3]) say?  (NAs should
>>>            not make a column non-numeric, unless there are only NAs there,
>>>            which is not the case here.)  check your data for non-numeric
>>>            entries in column 3, there can be a typo.
>>>
>>>            vQ
>>>
>>>
>>>               [[alternative HTML version deleted]]
>>>
>>>        ______________________________________________
>>>        [email protected] mailing list
>>>
>>>        https://stat.ethz.ch/mailman/listinfo/r-help
>>>        PLEASE do read the posting guide
>>>        http://www.R-project.org/posting-guide.html
>>>        and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>
