It's still kind of weird; embedded 2-column data frames print differently than 1-column ones:
> d <- data.frame(a=1, b=I(data.frame(d=1,e=2)))
> d
  a b.d b.e
1 1   1   2
> str(d)
'data.frame': 1 obs. of 2 variables:
 $ a: num 1
 $ b:Classes 'AsIs' and 'data.frame': 1 obs. of 2 variables:
  ..$ d: num 1
  ..$ e: num 2
> names(d)
[1] "a" "b"
> d <- data.frame(a=1, b=I(data.frame(d=1)))
> d
  a d
1 1 1
> str(d)
'data.frame': 1 obs. of 2 variables:
 $ a: num 1
 $ b:Classes 'AsIs' and 'data.frame': 1 obs. of 1 variable:
  ..$ d: num 1
> names(d)
[1] "a" "b"

It is happening inside format.data.frame() or as.data.frame.list() but I can't figure out the logic at this point.

-pd

> On 26 Oct 2023, at 10:55 , Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>
> On 25/10/2023 2:18 a.m., Christian Asseburg wrote:
>> Hi! I came across this unexpected behaviour in R. First I thought it was a
>> bug in the assignment operator <- but now I think it's maybe a bug in the
>> way data frames are being printed. What do you think?
>> Using R 4.3.1:
>>> x <- data.frame(A = 1, B = 2, C = 3)
>>> y <- data.frame(A = 1)
>>> x
>>   A B C
>> 1 1 2 3
>>> x$B <- y$A # works as expected
>>> x
>>   A B C
>> 1 1 1 3
>>> x$C <- y[1] # makes C disappear
>>> x
>>   A B A
>> 1 1 1 1
>>> str(x)
>> 'data.frame': 1 obs. of 3 variables:
>>  $ A: num 1
>>  $ B: num 1
>>  $ C:'data.frame': 1 obs. of 1 variable:
>>   ..$ A: num 1
>> Why does the print(x) not show "C" as the name of the third element? I did
>> mess up the data frame (and this was a mistake on my part), but finding the
>> bug was harder because print(x) didn't show the C any longer.
>
> y[1] is a dataframe with one column, i.e. it is identical to y. To get the
> result you expected, you should have used y[[1]], to extract column 1.
>
> Since dataframes are lists, you can assign them as columns of other
> dataframes, and you'll create a single column in the result whose rows are
> the columns of the dataframe you're assigning. This means that
>
>   x$C <- y[1]
>
> replaces the C column of x with a dataframe.
> It retains the name C (you can see this if you print names(x)), but since
> the column contains a dataframe, it chooses to use the column name of y
> when printing.
>
> If you try
>
>   x$D <- x
>
> you'll see it generate new names when printing, but the names within x remain
> as A, B, C, D.
>
> This is a situation where tibbles do a better job than dataframes: if you
> created x and y as tibbles instead of dataframes and executed your code,
> you'd see this:
>
>   library(tibble)
>   x <- tibble(A = 1, B = 2, C = 3)
>   y <- tibble(A = 1)
>   x$C <- y[1]
>   x
>   #> # A tibble: 1 × 3
>   #>       A     B   C$A
>   #>   <dbl> <dbl> <dbl>
>   #> 1     1     2     1
>
> Duncan Murdoch
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com