Hello All,

At the suggestion of commenters on a Stack Overflow discussion
(http://stackoverflow.com/questions/1535021/whats-the-biggest-r-gotcha-youve-run-across/1535433#1535433),
I'm forwarding the following behavior report to this list.

R Session:
> a<-data.frame(c(1,2,3,4),c(4,3,2,1))
> a<-a[-3,]
> a
  c.1..2..3..4. c.4..3..2..1.
1             1             4
2             2             3
4             4             1
> a[4,1]<-1
> a
Error in data.frame(c.1..2..3..4. = c("1", "2", "4", "1"), c.4..3..2..1. = c("4",  : 
  duplicate row.names: 4

What's going on:
    1. A four-row data.frame is created, so the row names are c(1,2,3,4).
    2. The third row is deleted, so the row names are c(1,2,4).
    3. A fourth row is added, and R automatically sets its row name equal to
the row index, i.e. 4, so the row names are c(1,2,4,4).
    4. print.data.frame throws an error because it requires unique row names.
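
To make the collision in step 3 concrete, here is a small sketch that inspects the row names after each step without performing the problematic assignment. The column names x and y are my own, chosen only for readability; they are not in the session above.

```r
# Same data as in the session above, with hypothetical column names x and y.
a <- data.frame(x = c(1, 2, 3, 4), y = c(4, 3, 2, 1))

# Step 2: dropping the third row keeps the remaining row names as they were.
a <- a[-3, ]
rn <- rownames(a)
print(rn)

# Step 3: the positional index of a newly appended row is nrow(a) + 1.
next_index <- nrow(a) + 1

# That index, used as a row name, is already taken by the old fourth row.
collides <- as.character(next_index) %in% rn
print(collides)
```

The check only shows that the index-derived name is already in use; whether the assignment then stores a duplicate depends on how `[<-.data.frame` handles it in the R version at hand.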

It seems to me that either R should automatically generate unique row names, 
or print.data.frame should accept duplicates. Looking at section 2.3.2 of the 
manual, it is unclear whether row names are required to be unique, but the 
help page for data.frame states: "A data frame is a list of variables of the 
same number of rows with unique row names,..." This implies that a[4,1]<-1 
creates an invalid data.frame object.
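
As a possible workaround until this is resolved, one can reset the row names to the automatic 1..n sequence before appending, so that the index-derived name cannot clash with a leftover one. This is only a sketch of one way around the problem, not a statement of how R ought to behave; the column names x and y are again my own.

```r
# Same data as in the report, with hypothetical column names x and y.
a <- data.frame(x = c(1, 2, 3, 4), y = c(4, 3, 2, 1))
a <- a[-3, ]

# Discard the leftover names 1, 2, 4 and fall back to automatic row names.
rownames(a) <- NULL

# Append a row by index; only column x is set, so y is filled with NA.
a[nrow(a) + 1, 1] <- 1

# anyDuplicated() returns 0 when every row name is unique.
n_dup <- anyDuplicated(rownames(a))
print(a)
```

The cost of this approach is that the original row labels are lost, so it only suits cases where the row names carry no meaning of their own.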



Cheers,
Ian Fellows

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
