Hi Steve,

These might help.

# generate some example data similar to your original email, with one extra
# line containing all NAs (first col included to provide similarity with your
# data)

labs <- c('4er66', '4gcyi', '5d3hh', '5d3wt', 'v3st5', 'a22g5', 'b5dd3',
'g44d2', 'zzzzz')
dat <- scan()
1 NA  1  0 NA  0
0  0  1  0  0  0
0  0  0 NA  0  0
0  0  0  0  0  0
NA NA  1 NA NA NA
NA  0 NA NA NA NA
NA  0 NA NA NA NA
NA  0 NA NA NA NA
NA NA NA NA NA NA

data <- data.frame(labs=labs, dat=matrix(dat, byrow=TRUE, ncol=6))
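
If you would rather source this from a script than paste it at the console, scan() can also read its input from a string. This is only a sketch, assuming an R version in which scan() has the `text` argument; `dat2` and `data2` are names I made up so that the objects above are not overwritten.

```r
labs <- c('4er66', '4gcyi', '5d3hh', '5d3wt', 'v3st5', 'a22g5', 'b5dd3',
          'g44d2', 'zzzzz')

# scan() treats the token NA as a missing value by default (na.strings = "NA")
dat2 <- scan(text = "
 1 NA  1  0 NA  0
 0  0  1  0  0  0
 0  0  0 NA  0  0
 0  0  0  0  0  0
NA NA  1 NA NA NA
NA  0 NA NA NA NA
NA  0 NA NA NA NA
NA  0 NA NA NA NA
NA NA NA NA NA NA
", quiet = TRUE)

# same layout as above: one label column followed by six value columns
data2 <- data.frame(labs = labs, dat = matrix(dat2, byrow = TRUE, ncol = 6))
```

The result should have the same shape as `data` above (9 rows, a label column plus 6 value columns).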

# I think this line of code gives what you want (number of rows that contain
# at least 1 non-NA value).
sum(apply(data[,-1], 1, function(x) any(!is.na(x))))

# to produce the line "(all)" in your original output (assuming that this
# line counts the number of non-NA entries in each col):
apply(data[,-1], 2, function(x) sum(!is.na(x)))
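
On counting the 1's in a column that still contains NAs: `x == 1` is NA wherever `x` is NA, and sum() returns NA as soon as one element is NA unless you pass na.rm=TRUE. The sketch below rebuilds the same example values as a plain matrix so that it runs on its own (`m`, `rows_with_data`, `non_na_per_col` and `ones_per_col` are my own names), and uses rowSums()/colSums() as an alternative to the apply() calls above. Whether this is what goes wrong with your `b` I can't say without seeing it.

```r
# same values as the example data above, without the label column
m <- matrix(c( 1, NA,  1,  0, NA,  0,
               0,  0,  1,  0,  0,  0,
               0,  0,  0, NA,  0,  0,
               0,  0,  0,  0,  0,  0,
              NA, NA,  1, NA, NA, NA,
              NA,  0, NA, NA, NA, NA,
              NA,  0, NA, NA, NA, NA,
              NA,  0, NA, NA, NA, NA,
              NA, NA, NA, NA, NA, NA),
            byrow = TRUE, ncol = 6)

# number of rows with at least one non-NA value
rows_with_data <- sum(rowSums(!is.na(m)) > 0)   # 8

# number of non-NA entries in each column
non_na_per_col <- colSums(!is.na(m))            # 4 6 5 3 3 4

# number of 1's in each column, skipping the NAs
ones_per_col <- colSums(m == 1, na.rm = TRUE)   # 1 0 3 0 0 0

# without na.rm the NAs propagate into the result
sum(m[, 3] == 1)                                # NA
sum(m[, 3] == 1, na.rm = TRUE)                  # 3
```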

Regards,
Simon Knapp





On Mon, Jan 18, 2010 at 8:37 AM, Steve Sidney <sbsid...@mweb.co.za> wrote:

> Well now I am totally baffled !!!!!!!!!!
>
> Using
>
> sum( !is.na(b[,3])) I get the total of all col 3 except those that are NA.
> Great, that solves the first problem.
>
> What I can't seem to do is use the same logic to count all the 1's in that
> col, which are there before I use the cast with margins.
>
> So it seems to me that somehow sum((b[,3]) == 1)  is wrong and is the part
> of my understanding that's missing.
>
> My guess is that before using margins and sum in the cast statement
> the col is a character type and in order for == 1 to work I need to convert
> this to an integer.
>
> Hope this helps you to understand the problem.
>
> Regards
> Steve
>
> Your help is much appreciated
>
> ----- Original Message ----- From: "David Winsemius" <dwinsem...@comcast.net>
> To: "Steve Sidney" <sbsid...@mweb.co.za>
> Cc: <r-help@r-project.org>
> Sent: Sunday, January 17, 2010 7:36 PM
>
> Subject: Re: [R] Help using Cast (Text) Version
>
>
>
>> On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
>>
>>> David
>>>
>>> Thanks, I'll try that......but no what I need is the total (1's) for
>>> each of the rows, labelled 1-6 at the top of each col in the table
>>> provided.
>>>
>>
>> Part of my confusion with your request (which remains unaddressed) is
>> what you mean by "valid". The melt-cast operation has turned a bunch of
>> NA's into 0's which are now indistinguishable from the original 0's. So I
>> don't see any way that operating on "b" could tell you the numbers you
>> are asking for. If you were working on the original data, "res", you
>> might have gotten the column-wise "valid" counts of column 2 with
>> something like:
>>
>>  sum( !is.na(res[,2]) )
>>
>>
>>> What I guess I am not sure of is how to identify the col after the melt
>>> and cast.
>>>
>>
>> The cast object represents columns as a list of vectors. The i-th column
>> is b[[i]] which could be further referenced as a vector. So the j-th row
>> entry for the i-th column would be b[[i]][j].
>>
>>
>>
>>> Steve
>>>
>>> ----- Original Message ----- From: "David Winsemius"
>>> <dwinsem...@comcast.net>
>>> To: "Steve Sidney" <sbsid...@mweb.co.za>
>>> Cc: <r-help@r-project.org>
>>> Sent: Sunday, January 17, 2010 4:39 PM
>>> Subject: Re: [R] Help using Cast (Text) Version
>>>
>>>
>>>
>>>> On Jan 17, 2010, at 5:31 AM, Steve Sidney wrote:
>>>>
>>>>> Sorry to repeat the message, not sure if the HTML version has been
>>>>> received - Apologies for duplication
>>>>>
>>>>> Dear list
>>>>>
>>>>> I am trying to count the no of occurrences in a column of a data frame
>>>>> and there is missing data identified by NA.
>>>>>
>>>>> I am able to melt and cast the data correctly as well as sum the
>>>>> occurrences using margins and sum.
>>>>>
>>>>> Here are the melt and cast commands
>>>>>
>>>>> bw = melt(res, id=c("lab","r"), "pf_zbw")
>>>>> b = cast(bw, lab ~ r, sum, margins = T)
>>>>>
>>>>> Sample Data (before using sum and margins)
>>>>>
>>>>>      lab  1  2  3  4  5  6
>>>>> 1  4er66  1 NA  1  0 NA  0
>>>>> 2  4gcyi  0  0  1  0  0  0
>>>>> 3  5d3hh  0  0  0 NA  0  0
>>>>> 4  5d3wt  0  0  0  0  0  0
>>>>> .
>>>>> . lines deleted to save space
>>>>> .
>>>>> 69 v3st5 NA NA  1 NA NA NA
>>>>> 70 a22g5 NA  0 NA NA NA NA
>>>>> 71 b5dd3 NA  0 NA NA NA NA
>>>>> 72 g44d2 NA  0 NA NA NA NA
>>>>>
>>>>> Data after using sum and margins
>>>>>
>>>>>      lab 1 2 3 4 5 6 (all)
>>>>> 1  4er66 1 0 1 0 0 0     2
>>>>> 2  4gcyi 0 0 1 0 0 0     1
>>>>> 3  5d3hh 0 0 0 0 0 0     0
>>>>> 4  5d3wt 0 0 0 0 0 0     0
>>>>> 5  6n44r 0 0 0 0 0 0     0
>>>>> .
>>>>> .lines deleted to save space
>>>>> .
>>>>> 70 a22g5 0 0 0 0 0 0     0
>>>>> 71 b5dd3 0 0 0 0 0 0     0
>>>>> 72 g44d2 0 0 0 0 0 0     0
>>>>> 73 (all) 5 2 4 3 5 7    26
>>>>>
>>>>> Using length just tells me how many total rows there are.
>>>>>
>>>>
>>>>
>>>>> What I need to do is count how many rows have valid data, in this
>>>>> case either a one (1) or a zero (0) in b
>>>>>
>>>>
>>>> I'm guessing that you mean to apply that test to the column in b
>>>> labeled "(all)". If that's the case, then something like (obviously
>>>> untested):
>>>>
>>>> sum( b$'(all)' == 1 | b$'(all)' == 0)
>>>>
>>>>
>>>>
>>>>
>>>>> I have a report to construct for tomorrow Mon so any help would be
>>>>> appreciated
>>>>>
>>>>> Regards
>>>>> Steve
>>>>>
>>>>
>>>> David Winsemius, MD
>>>> Heritage Laboratories
>>>> West Hartford, CT
>>>>
>>>>
>>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>>
>>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


