Re: [R] A question on dummy variable

John Sorkin Tue, 11 Jan 2011 19:39:31 -0800

Christofer,
I am not sure I understand how you are using your dummy variables. Generally, if 
you have n categories you need n-1 dummy variables. Thus if you have three 
categories (low, medium, high) and want to compare two of the levels to a 
reference level (a coding scheme sometimes called reference cell coding), you 
could use the following coding, which compares medium and high to the reference 
level, low:


level   dummy1 dummy2
low        0     0        
medium     0     1
high       1     0

You will notice that for three categories, my dummy variables form a 3 by 2 
matrix. In general the dummy variable matrix for n categories will be an n by 
n-1 matrix. You say you have four seasons. I would expect your dummy variable 
matrix to be of size 4 by 3. The matrices you show are 6 by 3. Am I not 
understanding what you are trying to do?
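
For illustration, reference cell coding can be reproduced in R roughly as
follows. This is only a sketch: the factor `level` and the season names are
made up for the example, and it assumes R's built-in treatment contrasts
(`contr.treatment`) are the coding meant here.

```r
# Hypothetical three-level factor; "low" is listed first, so it is the reference level.
level <- factor(c("low", "medium", "high"), levels = c("low", "medium", "high"))

# contr.treatment() builds reference cell (treatment) coding:
# n levels give an n by (n-1) matrix, with an all-zero row for the reference level.
coding <- contr.treatment(levels(level))
print(coding)
print(dim(coding))

# model.matrix() adds an intercept column; the other columns are the dummy variables.
X <- model.matrix(~ level)
print(X)

# Four seasons give a 4 by 3 coding matrix (season names are illustrative only).
season_coding <- contr.treatment(c("winter", "spring", "summer", "autumn"))
print(dim(season_coding))
```

R orders the dummy columns by factor level (medium, then high), so the columns
appear in the opposite order from the table above, but the coding is the same.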
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsor...@grecc.umaryland.edu

>>> Christofer Bogaso <bogaso.christo...@gmail.com> 1/11/2011 3:18 PM >>>
Dear all, I would like to ask one question related to statistics,
specifically on defining dummy variables. So far I have come across 3
different kinds of dummy variables (assuming I am working with seasonal
dummies, and the number of seasons is 4):

> dummy1 <- diag(4)
> for(i in 1:3) dummy1 <- rbind(dummy1, diag(4))
> dummy1 <- dummy1[,-4]
>
> dummy2 <- dummy1
> dummy2[dummy2 == 0] = -1/(4-1)
>
> dummy3 <- dummy1 - 1/4
>
> head(dummy1)
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1
[4,]    0    0    0
[5,]    1    0    0
[6,]    0    1    0
> head(dummy2)
           [,1]       [,2]       [,3]
[1,]  1.0000000 -0.3333333 -0.3333333
[2,] -0.3333333  1.0000000 -0.3333333
[3,] -0.3333333 -0.3333333  1.0000000
[4,] -0.3333333 -0.3333333 -0.3333333
[5,]  1.0000000 -0.3333333 -0.3333333
[6,] -0.3333333  1.0000000 -0.3333333
> head(dummy3)
      [,1]  [,2]  [,3]
[1,]  0.75 -0.25 -0.25
[2,] -0.25  0.75 -0.25
[3,] -0.25 -0.25  0.75
[4,] -0.25 -0.25 -0.25
[5,]  0.75 -0.25 -0.25
[6,] -0.25  0.75 -0.25
Now I want to know which type of dummy definition is called a centered dummy,
and why it is called so. Is it equivalent to use any of the above
definitions (at least the 2nd and 3rd)? It would really be very helpful if
somebody could offer any suggestion or clarification.
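
One way to explore the question is to look at the column means of each
matrix. The sketch below assumes that "centered" means each dummy column has
mean zero over a balanced sample (that reading is an assumption, not something
established in this thread), and it rebuilds the three matrices with
`replicate()` instead of the loop above.

```r
# Rebuild the three matrices: 4 full cycles of 4 seasons, 4th season dropped.
dummy1 <- do.call(rbind, replicate(4, diag(4), simplify = FALSE))[, -4]
dummy2 <- dummy1
dummy2[dummy2 == 0] <- -1 / (4 - 1)
dummy3 <- dummy1 - 1 / 4

# Column means over the balanced sample of 16 observations.
print(colMeans(dummy1))
print(colMeans(dummy2))
print(colMeans(dummy3))

# Check whether dummy2 is just a rescaled version of dummy3 (factor 4/3).
print(all.equal(dummy2, (4 / 3) * dummy3))
```

With balanced seasons, the columns of dummy1 average 1/4, while those of dummy2
and dummy3 average zero up to rounding. dummy2 equals 4/3 times dummy3, so
under this reading the 2nd and 3rd definitions differ only in scale.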

Thanks and regards,


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.


