Re: [R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed

Avi Gross via R-help Fri, 28 Jan 2022 15:23:10 -0800


Javed,
Your explanation allows many other ways to look at the problem.

Some of them skip steps and get to the point faster. Of course, I do not know 
what exactly you mean by the "fairness object" other than guessing it does an 
evaluation of what you supply and lets you know if it is fair.

For something categorical like gender it used to be easy to use the table() 
function to show how many of each category you have. Of course, it now seems 
that old assumptions about two genders are being replaced by additional choices 
so it may literally be nonbinary.

Your code looked for 'T14' which gives no clue about purpose. Here is an 
example where I coded the words "male" and "female" in a small sample for 
illustration. You can leave the data as is and have it automatically count or 
take percentages and then extract whatever you want and use it to make 
decisions.

The darn HTML stripper this list uses makes showing code hard, so I have to 
disperse it with extra spacing.
Here is some data:
gender <- c("male", "female", "female", "male", "female", "female", "female")

I made it lopsided and you can see the counts easily enough with:

   tab.cnt <- table(gender)

The output is:
> tab.cnt
gender
female   male 
     5      2 

You can of course get percentages using the table object:

   tab.prcnt <- prop.table(tab.cnt)

The output is:

> tab.prcnt
gender
   female      male 
0.7142857 0.2857143 

You can, of course, multiply the above by a hundred and use round() to trim it 
to fewer digits, but what you can do is extract the numbers to do things like a 
comparison.

Consider deciding that more than 60% females is too much:

if (tab.prcnt[["female"]] > 0.6)  print("too many women")
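
As a rough sketch of the rounding and comparison just described, reusing the same sample data. The 60% cutoff and the names pct, threshold and too_many are only there for illustration:

```r
# Same small sample as above
gender <- c("male", "female", "female", "male", "female", "female", "female")

tab.cnt   <- table(gender)          # counts per category
tab.prcnt <- prop.table(tab.cnt)    # proportions that sum to 1

# Percentages rounded to one decimal place
pct <- round(100 * tab.prcnt, 1)
print(pct)

# Illustrative cutoff, not a recommendation
threshold <- 0.6
too_many <- tab.prcnt[["female"]] > threshold
if (too_many) print("too many women")
```

Using [[ ]] with the category name pulls out a bare number, so the comparison does not depend on the order in which table() happens to list the categories.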

Your criteria may of course be more complicated, but the thing I am teaching is 
that there are built-in methods that may be used as you get to know not only 
the language but techniques that work well with it. Your need may work well 
with your technique of converting your data representation from one form to a 
numeric form. Realistically, many might simply use another built-in feature 
called factors. Converting my data to a factor does this:

> fact <- factor(gender)
> fact
[1] male   female female male   female female female
Levels: female male
> as.numeric(fact)
[1] 2 1 1 2 1 1 1

The default is to use integers starting with 1 but you can change that in many 
ways, or in the above, simply subtract 1 to get what you want. To get the 
proportion of men in the above, you can do something like this:

> mean(as.numeric(fact) - 1)
[1] 0.2857143
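
One caution on the subtraction trick: it relies on the alphabetical level order that factor() picks by default, so "male" only maps to 1 after subtracting because it sorts after "female". A sketch of two alternatives, either setting the levels explicitly or comparing against the label directly (the names fact2 and is_male are mine):

```r
gender <- c("male", "female", "female", "male", "female", "female", "female")

# Put the levels in a chosen order instead of the alphabetical default
fact2 <- factor(gender, levels = c("male", "female"))
as.integer(fact2)            # male is now coded 1 and female 2

# Or skip the integer codes and test the label itself
is_male <- gender == "male"  # logical vector
mean(is_male)                # proportion of men, 2 out of 7
as.integer(is_male)          # 1 for male, 0 for female
```

The second form keeps working if more categories are added later, since it never looks at the integer codes.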

You may get lots of advice on many methods and ways to do things but pick what 
fits your situation and sometimes you can try to change the situation. For some 
purposes, categorical data needs to be transformed for proper use in something 
like machine learning algorithms but sometimes it can be left alone as shown 
above and the statistics can be worked with. 
From: javed khan <javedbtk...@gmail.com>
To: Avi Gross <avigr...@verizon.net>
Cc: r-help@r-project.org <r-help@r-project.org>
Sent: Fri, Jan 28, 2022 8:34 am
Subject: Re: Error in if (fraction <= 1) { : missing value where TRUE/FALSE 
needed

Avi Gross, thanks for your reply. 
I have no interest in using zero and one in my code; TRUE/FALSE would also be 
OK because I don't have to do any arithmetic with it. 
I just want to pass a protected variable and one of its (privileged) values to 
the fairness object to see if the model has any bias towards the unprivileged 
values of the protected variable. 
You can consider my protected variable as Sex and its values as male and 
female. I want the fairness object to see if there is any bias towards the 
female group which could be considered as an unprivileged group. 

Thanks

On Thursday, January 27, 2022, Avi Gross via R-help <r-help@r-project.org> 
wrote:

Javed,
You may misunderstand something here.
Forget ifelse() which does all kinds of things (which you can see by just 
typing "ifelse" and a carriage return or ENTER).
Your initial goal should be kept in mind. You want to create a data structure, 
in this case a vector, that is the same length as another vector called 
test$operator in which you mark whether the corresponding element was exactly 
"T13" or not.
There is nothing fundamentally wrong with your approach albeit it is overkill 
in this case. As has been pointed out, SKIPPING ifelse() entirely, you can get 
a vector of Logicals (TRUE or FALSE) by a simple command like this:
    result <- test$operator == 'T13'
For many purposes, that is all you need. TRUE and FALSE are also sometimes 
mapped into 1 and 0 for various purposes, so you can convert them into integers 
or general numerics if that is needed. Consider the following code that checks 
the integers from 1 to 7 to see if they are even (as in divisible by 2):

> result <- 1:7 %% 2 == 0
> result
[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> as.integer(result)
[1] 0 1 0 1 0 1 0
> as.numeric(result)
[1] 0 1 0 1 0 1 0
> result <- as.integer(1:7 %% 2 == 0)
> result
[1] 0 1 0 1 0 1 0

If for some reason the choice of 1 and 0 is the opposite of what you need, you 
can invert them several ways with the simplest being:

    as.integer(1:7 %% 2 != 0)
or
    as.integer(!(1:7 %% 2 == 0))

The first negates the comparison and the second just flips every FALSE and TRUE 
to the other.
Why are we talking about this? For many more interesting cases, ifelse() is 
great as you can replace one or both of the choices with anything. A very 
common case is replacing one choice with itself and changing the other, or 
nesting the comparisons in a sort of simulated tree as in 
    ifelse(some_condition,
           ifelse(second_condition, result1, result2),
           ifelse(third_condition, result3, result4))
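
A small runnable version of that nested pattern, with made-up conditions and labels (x, the cutoffs and the four result strings are all invented for illustration):

```r
x <- c(-5, -1, 2, 8)

# Outer test splits negative from non-negative; each branch then
# applies its own second test, element by element
labels <- ifelse(x < 0,
                 ifelse(x < -3, "very negative", "slightly negative"),
                 ifelse(x > 5,  "large positive", "small positive"))
labels
```

Every element of x ends up with exactly one of the four labels, which is the simulated tree in miniature.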

But you seem to want the simplest return of two values that also happen to be 
the underlying equivalent of TRUE and FALSE in many languages. In R, 
anything that evaluates to zero (or the Boolean value FALSE) tends to be 
treated as FALSE, and anything else like a 1 or 666 is treated as TRUE, as 
shown below:

> if (TRUE) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (1) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (666) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (FALSE) print("TRUE") else print("FALSE")
[1] "FALSE"
> if (0) print("TRUE") else print("FALSE")
[1] "FALSE"

This is why you are being told that for many purposes, the Boolean vector may 
work fine. But if you really want or need zero and one, that is a trivial 
transformation as shown. Feel free to use ifelse() and then figure out what 
went wrong with your code, but also to try the simpler version and see if the 
problem goes away.
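
On figuring out what went wrong: the error in the subject line is what R reports when the condition handed to if() is NA. I cannot see your data or the function that computes "fraction", so the following is only a sketch with an invented data frame showing how a missing value can travel through a comparison:

```r
# Invented stand-in for the real data; note the NA
test <- data.frame(operator = c("T13", "T02", NA, "T13"))

result <- test$operator == "T13"
result                      # TRUE FALSE NA TRUE: the NA survives the comparison

# if() needs a single TRUE or FALSE, so an NA condition stops with
# "missing value where TRUE/FALSE needed"; tryCatch() captures the message
msg <- tryCatch(
  if (result[3]) "yes" else "no",
  error = function(e) conditionMessage(e)
)
msg

# %in% never returns NA, which may or may not be what you want
test$operator %in% "T13"    # TRUE FALSE FALSE TRUE
anyNA(test$operator)        # TRUE, so worth checking before going further
```

If anyNA() reports TRUE on your real column, deciding how to treat those rows is probably the first step, whichever of ifelse() or the plain comparison you end up using.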
Avi
-----Original Message-----
From: javed khan <javedbtk...@gmail.com>
To: Bert Gunter <bgunter.4...@gmail.com>
Cc: R-help <r-help@r-project.org>
Sent: Thu, Jan 27, 2022 1:15 pm
Subject: Re: [R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE 
needed

Thank you Bert Gunter

Do you mean I should do something like this:

prot <- as.numeric(ifelse(test$operator == 'T13', 1, 0))


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
