Thanks Bert, I'm reading some books now. But it takes me a while to get familiar with R.
Best,
Kai

On Friday, July 9, 2021, 03:06:11 PM PDT, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

On 09/07/2021 5:51 p.m., Jeff Newmiller wrote:
> "Strictly speaking", Greg is correct, Bert.
>
> https://cran.r-project.org/doc/manuals/r-release/R-lang.html#List-objects
>
> Lists in R are vectors. What we colloquially refer to as "vectors" are more
> precisely referred to as "atomic vectors". And without a doubt, this "vector"
> nature of lists is a key underlying concept that explains why adding a dim
> attribute creates a matrix that can hold data frames. It is also a stumbling
> block for programmers from other languages that have things like linked lists.

I would also object to v3 (below) as "extracting" a column from d.
"d[2]" doesn't extract anything, it "subsets" the data frame, so the
result is a data frame, not what you get when you extract something from
a data frame.

People don't realize that "x <- 1:10; y <- x[[3]]" is perfectly legal.
That extracts the 3rd element (the number 3). The problem is that R has
no way to represent a scalar number, only a vector of numbers, so x[[3]]
gets promoted to a vector containing that number when it is returned and
assigned to y.

Lists are vectors of R objects, so if x is a list, x[[3]] is something
that can be returned, and it is different from x[3].

Duncan Murdoch

>
> On July 9, 2021 2:36:19 PM PDT, Bert Gunter <bgunter.4...@gmail.com> wrote:
>> "1. a column, when extracted from a data frame, *is* a vector."
>> Strictly speaking, this is false; it depends on exactly what is meant
>> by "extracted."
>> e.g.:
>>
>> > d <- data.frame(col1 = 1:3, col2 = letters[1:3])
>> > v1 <- d[,2]   ## a vector
>> > v2 <- d[[2]]  ## the same, i.e.
>> > identical(v1,v2)
>> [1] TRUE
>> > v3 <- d[2]    ## a data.frame
>> > v1
>> [1] "a" "b" "c"  ## a character vector
>> > v3
>>   col2
>> 1    a
>> 2    b
>> 3    c
>> > is.vector(v1)
>> [1] TRUE
>> > is.vector(v3)
>> [1] FALSE
>> > class(v3)  ## data.frame
>> [1] "data.frame"
>> ## but
>> > is.list(v3)
>> [1] TRUE
>>
>> which is simply explained in ?data.frame (where else?!) by:
>> "A data frame is a **list** [emphasis added] of variables of the same
>> number of rows with unique row names, given class "data.frame". If no
>> variables are included, the row names determine the number of rows."
>>
>> "2. maybe your question is "is a given function for a vector, or for a
>> data frame/matrix/array?". if so, i think the only way is reading
>> the help information (?foo)."
>>
>> Indeed! Is this not what the Help system is for?! But note also that
>> the S3 class system may somewhat blur the issue: foo() may work
>> appropriately and differently for different (S3) classes of objects. A
>> detailed explanation of this behavior can be found in appropriate
>> resources or (more tersely) via ?UseMethod .
>>
>> "you might find reading ?"[" and ?"[.data.frame" useful"
>>
>> Not just "useful" -- **essential** if you want to work in R, unless
>> one gets this information via any of the numerous online tutorials,
>> courses, or books that are available. The Help system is accurate and
>> authoritative, but terse. I happen to like this mode of documentation,
>> but others may prefer more extended expositions. I stand by this claim
>> even if one chooses to use the "Tidyverse", data.table package, or
>> other alternative frameworks for handling data. Again, others may
>> disagree, but R is structured around these basics, and imo one remains
>> ignorant of them at their peril.
>>
>> Cheers,
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Fri, Jul 9, 2021 at 11:57 AM Greg Minshall <minsh...@umich.edu> wrote:
>>>
>>> Kai,
>>>
>>>> one more question, how can I know if the function is for column
>>>> manipulations or for vector?
>>>
>>> i still stumble around R code. but, i'd say the following (and look
>>> forward to being corrected! :):
>>>
>>> 1. a column, when extracted from a data frame, *is* a vector.
>>>
>>> 2. maybe your question is "is a given function for a vector, or for a
>>>    data frame/matrix/array?". if so, i think the only way is reading
>>>    the help information (?foo).
>>>
>>> 3. sometimes, extracting the column as a vector from a data frame-like
>>>    object might be non-intuitive. you might find reading ?"[" and
>>>    ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
>>>    package). also, the str() command can be helpful in understanding
>>>    what is happening. (the lobstr:: package's sxp() function, as well
>>>    as more verbose .Internal(inspect()) can also give you insight.)
>>>
>>>    with the data.table:: package, for example, if "DT" is a data.table
>>>    object, with "x2" as a column, adding or leaving off quotation marks
>>>    for the column name can make all the difference between ending up
>>>    with a vector, or with a (much reduced) data table:
>>> ----
>>> > is.vector(DT[, x2])
>>> [1] TRUE
>>> > str(DT[, x2])
>>>  num [1:9] 32 32 32 32 32 32 32 32 32
>>> >
>>> > is.vector(DT[, "x2"])
>>> [1] FALSE
>>> > str(DT[, "x2"])
>>> Classes ‘data.table’ and 'data.frame': 9 obs. of 1 variable:
>>>  $ x2: num 32 32 32 32 32 32 32 32 32
>>>  - attr(*, ".internal.selfref")=<externalptr>
>>> ----
>>>
>>>    a second level of indexing may or may not help, mostly depending on
>>>    the use of '[' versus of '[['.
>>>    this can sometimes cause confusion
>>>    when you are learning the language.
>>> ----
>>> > str(DT[, "x2"][1])
>>> Classes ‘data.table’ and 'data.frame': 1 obs. of 1 variable:
>>>  $ x2: num 32
>>>  - attr(*, ".internal.selfref")=<externalptr>
>>> > str(DT[, "x2"][[1]])
>>>  num [1:9] 32 32 32 32 32 32 32 32 32
>>> ----
>>>
>>>    the tibble:: package (used in, e.g., the dplyr:: package) also
>>>    (always?) returns a single column as a non-vector. again, a
>>>    second indexing with double '[[]]' can produce a vector.
>>> ----
>>> > DP <- tibble(DT)
>>> > is.vector(DP[, "x2"])
>>> [1] FALSE
>>> > is.vector(DP[, "x2"][[1]])
>>> [1] TRUE
>>> ----
>>>
>>> but, note that a list of lists is also a vector:
>>> > is.vector(list(list(1), list(1,2,3)))
>>> [1] TRUE
>>> > str(list(list(1), list(1,2,3)))
>>> List of 2
>>>  $ :List of 1
>>>   ..$ : num 1
>>>  $ :List of 3
>>>   ..$ : num 1
>>>   ..$ : num 2
>>>   ..$ : num 3
>>>
>>> etc.
>>>
>>> hth. good luck learning!
>>>
>>> cheers, Greg
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
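For readers following the thread, here is a minimal base-R sketch of the subset-versus-extract distinction that Duncan and Bert describe. It reuses the data frame `d` from Bert's example; the names `sub_df`, `col_vec`, `col_dlr`, and `lst` are made up for illustration, and `stringsAsFactors = FALSE` is set explicitly on the assumption that the code might be run on an R version older than 4.0.

```r
# Base R only. `d` mirrors the data frame from earlier in the thread;
# stringsAsFactors = FALSE keeps col2 a character vector on older R versions.
d <- data.frame(col1 = 1:3, col2 = letters[1:3], stringsAsFactors = FALSE)

sub_df  <- d[2]     # `[` with a single index subsets: still a data frame
col_vec <- d[[2]]   # `[[` extracts the column itself
col_dlr <- d$col2   # `$` extracts the same column by name

class(sub_df)                # "data.frame"
is.list(sub_df)              # TRUE: a data frame is a list of columns
is.character(col_vec)        # TRUE
identical(col_vec, col_dlr)  # TRUE

# Atomic vector: there is no scalar type, so `[[` still gives a length-1 vector.
x <- 1:10
y <- x[[3]]
identical(y, x[3])           # TRUE
length(y)                    # 1

# List: `[` keeps the list wrapper, `[[` returns the element inside it.
lst <- list(a = 1:3, b = "text")
is.list(lst[1])              # TRUE: a list of length 1
is.list(lst[[1]])            # FALSE: the integer vector 1:3
is.vector(lst)               # TRUE: lists are vectors too
```

The same pattern is a reasonable first check with data.table or tibble objects: call str() on the result to see whether you got the column or a one-column table.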
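Bert's remark that S3 classes can blur whether a function is "for a vector" or "for a data frame" can be illustrated with a small sketch. The generic `describe` and its two methods are hypothetical, invented here for illustration; only `UseMethod`, `typeof`, `nrow`, and `ncol` are actual base R functions.

```r
# `describe` is a hypothetical generic, not part of base R or any package.
describe <- function(x, ...) UseMethod("describe")

# Fallback method, used for plain vectors and anything without its own method.
describe.default <- function(x, ...) {
  paste("object of type", typeof(x), "and length", length(x))
}

# Method chosen automatically when the argument has class "data.frame".
describe.data.frame <- function(x, ...) {
  paste("data frame with", nrow(x), "rows and", ncol(x), "columns")
}

d <- data.frame(col1 = 1:3, col2 = letters[1:3], stringsAsFactors = FALSE)

describe(d)        # uses describe.data.frame
describe(d[[2]])   # extracted column: uses describe.default
describe(d[2])     # one-column subset is still a data frame
```

So the same call can behave differently depending on whether a column was extracted with `[[` or subset with `[`, which is why the help pages for the specific methods are worth reading.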