OK, I stand somewhat chastised.

But my point still is that what you get when you "extract" depends on
how you define "extract." Do note that ?"[" yields a help file titled
"Extract or Replace Parts of an object"; and afaics, the term "subset"
is not explicitly used there in the sense Duncan prefers. The relevant
part of the Help file for "[" says, for recursive objects: "Indexing by
[ is similar to atomic vectors and selects a list of the specified
element(s)."

That a data.frame is a list is explicitly stated, as I noted; that
lists are in fact vectors is also explicitly stated (?list says:
"Almost all lists in R internally are Generic Vectors"). But then one
is stuck with: a data.frame is a list and therefore a vector, yet
is.vector(v3) is FALSE. The explanation is again explicit in ?is.vector
("is.vector returns TRUE if x is a vector of the specified mode having
no attributes other than names. It returns FALSE otherwise."): a data
frame also carries class and row.names attributes.

But I would say these issues are sufficiently murky that my warning to
be precise is not entirely inappropriate; unfortunately, I may have
made them more so. Sigh....
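For anyone who wants to check this at the console, here is a minimal
sketch of the points above. It rebuilds the small data frame d from the
quoted example; stringsAsFactors = FALSE is added here only so the
result does not depend on the R version, and the names x and bare are
made up for this illustration.

```r
# The example data frame from the quoted message, with strings kept as character.
d <- data.frame(col1 = 1:3, col2 = letters[1:3], stringsAsFactors = FALSE)

v2 <- d[[2]]   # "[[" extracts the column itself: a character vector
v3 <- d[2]     # "["  selects a list of the specified element: a data frame

is.list(v3)    # a data frame is a list
is.vector(v3)  # FALSE, because of the attributes listed next
setdiff(names(attributes(v3)), "names")   # the attributes other than names

# Strip everything except names and is.vector() no longer objects.
bare <- unclass(v3)               # drops the class attribute
attr(bare, "row.names") <- NULL   # drops the row names
is.vector(bare)                   # TRUE: a plain named list
is.vector(v2)                     # TRUE: an atomic vector with no attributes

# The same "[" versus "[[" distinction on an ordinary list.
x <- list(a = 1, b = "two")
x[2]     # a list of length 1
x[[2]]   # the element itself, "two"
```

None of this settles what "extract" ought to mean; it only shows that
the FALSE from is.vector() comes from the extra attributes and not from
the data frame failing to be a list.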
Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Jul 9, 2021 at 3:05 PM Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>
> On 09/07/2021 5:51 p.m., Jeff Newmiller wrote:
> > "Strictly speaking", Greg is correct, Bert.
> >
> > https://cran.r-project.org/doc/manuals/r-release/R-lang.html#List-objects
> >
> > Lists in R are vectors. What we colloquially refer to as "vectors" are more
> > precisely referred to as "atomic vectors". And without a doubt, this
> > "vector" nature of lists is a key underlying concept that explains why
> > adding a dim attribute creates a matrix that can hold data frames. It is
> > also a stumbling block for programmers from other languages that have
> > things like linked lists.
>
> I would also object to v3 (below) as "extracting" a column from d.
> "d[2]" doesn't extract anything, it "subsets" the data frame, so the
> result is a data frame, not what you get when you extract something from
> a data frame.
>
> People don't realize that "x <- 1:10; y <- x[[3]]" is perfectly legal.
> That extracts the 3rd element (the number 3). The problem is that R has
> no way to represent a scalar number, only a vector of numbers, so x[[3]]
> gets promoted to a vector containing that number when it is returned and
> assigned to y.
>
> Lists are vectors of R objects, so if x is a list, x[[3]] is something
> that can be returned, and it is different from x[3].
>
> Duncan Murdoch
>
> >
> > On July 9, 2021 2:36:19 PM PDT, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >> "1. a column, when extracted from a data frame, *is* a vector."
> >> Strictly speaking, this is false; it depends on exactly what is meant
> >> by "extracted." e.g.:
> >>
> >>> d <- data.frame(col1 = 1:3, col2 = letters[1:3])
> >>> v1 <- d[,2] ## a vector
> >>> v2 <- d[[2]] ## the same, i.e
> >>> identical(v1,v2)
> >> [1] TRUE
> >>> v3 <- d[2] ## a data.frame
> >>> v1
> >> [1] "a" "b" "c" ## a character vector
> >>> v3
> >>   col2
> >> 1    a
> >> 2    b
> >> 3    c
> >>> is.vector(v1)
> >> [1] TRUE
> >>> is.vector(v3)
> >> [1] FALSE
> >>> class(v3) ## data.frame
> >> [1] "data.frame"
> >> ## but
> >>> is.list(v3)
> >> [1] TRUE
> >>
> >> which is simply explained in ?data.frame (where else?!) by:
> >> "A data frame is a **list** [emphasis added] of variables of the same
> >> number of rows with unique row names, given class "data.frame". If no
> >> variables are included, the row names determine the number of rows."
> >>
> >> "2. maybe your question is "is a given function for a vector, or for a
> >> data frame/matrix/array?". if so, i think the only way is reading
> >> the help information (?foo)."
> >>
> >> Indeed! Is this not what the Help system is for?! But note also that
> >> the S3 class system may somewhat blur the issue: foo() may work
> >> appropriately and differently for different (S3) classes of objects. A
> >> detailed explanation of this behavior can be found in appropriate
> >> resources or (more tersely) via ?UseMethod .
> >>
> >> "you might find reading ?"[" and ?"[.data.frame" useful"
> >>
> >> Not just 'useful" -- **essential** if you want to work in R, unless
> >> one gets this information via any of the numerous online tutorials,
> >> courses, or books that are available. The Help system is accurate and
> >> authoritative, but terse. I happen to like this mode of documentation,
> >> but others may prefer more extended expositions. I stand by this claim
> >> even if one chooses to use the "Tidyverse", data.table package, or
> >> other alternative frameworks for handling data. Again, others may
> >> disagree, but R is structured around these basics, and imo one remains
> >> ignorant of them at their peril.
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >> On Fri, Jul 9, 2021 at 11:57 AM Greg Minshall <minsh...@umich.edu>
> >> wrote:
> >>>
> >>> Kai,
> >>>
> >>>> one more question, how can I know if the function is for column
> >>>> manipulations or for vector?
> >>>
> >>> i still stumble around R code. but, i'd say the following (and look
> >>> forward to being corrected! :):
> >>>
> >>> 1. a column, when extracted from a data frame, *is* a vector.
> >>>
> >>> 2. maybe your question is "is a given function for a vector, or for a
> >>>    data frame/matrix/array?". if so, i think the only way is reading
> >>>    the help information (?foo).
> >>>
> >>> 3. sometimes, extracting the column as a vector from a data frame-like
> >>>    object might be non-intuitive. you might find reading ?"[" and
> >>>    ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
> >>>    package). also, the str() command can be helpful in understanding
> >>>    what is happening. (the lobstr:: package's sxp() function, as well
> >>>    as more verbose .Internal(inspect()) can also give you insight.)
> >>>
> >>>    with the data.table:: package, for example, if "DT" is a data.table
> >>>    object, with "x2" as a column, adding or leaving off quotation marks
> >>>    for the column name can make all the difference between ending up
> >>>    with a vector, or with a (much reduced) data table:
> >>> ----
> >>>> is.vector(DT[, x2])
> >>> [1] TRUE
> >>>> str(DT[, x2])
> >>>  num [1:9] 32 32 32 32 32 32 32 32 32
> >>>>
> >>>> is.vector(DT[, "x2"])
> >>> [1] FALSE
> >>>> str(DT[, "x2"])
> >>> Classes ‘data.table’ and 'data.frame': 9 obs. of 1 variable:
> >>>  $ x2: num 32 32 32 32 32 32 32 32 32
> >>>  - attr(*, ".internal.selfref")=<externalptr>
> >>> ----
> >>>
> >>>    a second level of indexing may or may not help, mostly depending on
> >>>    the use of '[' versus of '[['. this can sometimes cause confusion
> >>>    when you are learning the language.
> >>> ----
> >>>> str(DT[, "x2"][1])
> >>> Classes ‘data.table’ and 'data.frame': 1 obs. of 1 variable:
> >>>  $ x2: num 32
> >>>  - attr(*, ".internal.selfref")=<externalptr>
> >>>> str(DT[, "x2"][[1]])
> >>>  num [1:9] 32 32 32 32 32 32 32 32 32
> >>> ----
> >>>
> >>>    the tibble:: package (used in, e.g., the dplyr:: package) also
> >>>    (always?) returns a single column as a non-vector. again, a
> >>>    second indexing with double '[[]]' can produce a vector.
> >>> ----
> >>>> DP <- tibble(DT)
> >>>> is.vector(DP[, "x2"])
> >>> [1] FALSE
> >>>> is.vector(DP[, "x2"][[1]])
> >>> [1] TRUE
> >>> ----
> >>>
> >>>    but, note that a list of lists is also a vector:
> >>>> is.vector(list(list(1), list(1,2,3)))
> >>> [1] TRUE
> >>>> str(list(list(1), list(1,2,3)))
> >>> List of 2
> >>>  $ :List of 1
> >>>   ..$ : num 1
> >>>  $ :List of 3
> >>>   ..$ : num 1
> >>>   ..$ : num 2
> >>>   ..$ : num 3
> >>>
> >>> etc.
> >>>
> >>> hth. good luck learning!
> >>>
> >>> cheers, Greg
> >>>
> >>> ______________________________________________
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.