It is not about outlawing matrix notation. On the contrary, it is about consistency. For tibbles, [] always returns another tibble. If you wanted a column vector, you should have asked for a column vector (with [[ ]] or $). Does the fact that DF[ , 1 ] yields a different type than DF[ 1, ] and DF[ 1:2, ] satisfy your desire to "support" matrix notation? MATLAB has no concept of vectors distinct from row or column matrices, but R tries too hard to blur the lines between vectors and matrix-like objects. The "drop" argument was a mistaken hack in defense of this failure to live with the difference between vectors, matrix-like objects, and data frames.
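To make the type differences concrete, here is a minimal sketch in base R. DF and the result names are hypothetical, made up only for this illustration; the comments describe what base R returns for an ordinary multi-column data frame.

```r
# Hypothetical two-column data frame, used only for illustration.
DF <- data.frame(a = c(3L, 1L, 3L), b = c("x", "y", "x"))

row_one   <- DF[1, ]                # one row: still a data frame
rows_two  <- DF[1:2, ]              # two rows: still a data frame
col_drop  <- DF[, 1]                # one column, matrix style: dropped to a vector
col_keep  <- DF[, 1, drop = FALSE]  # drop = FALSE keeps the data frame
col_list  <- DF[1]                  # list-style [ keeps the data frame
col_inner <- DF[[1]]                # [[ extracts the column vector itself

# Classes of the six results, in the order above.
sapply(list(row_one, rows_two, col_drop, col_keep, col_list, col_inner), class)

# Sorting the distinct values needs the vector, not the one-column data frame.
sorted_unique <- sort(unique(DF[[1]]))
```

With a tibble, the matrix-style single-column selection would by default stay a tibble, which is the consistency argued for above; that case is left out here so the sketch does not depend on any package.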
On December 21, 2021 10:09:14 AM PST, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>On 21/12/2021 12:53 p.m., Duncan Murdoch wrote:
>> On 21/12/2021 12:29 p.m., Jeff Newmiller wrote:
>>> It is a very rational choice, not a design flaw. I don't like every choice
>>> they have made for that class, but this one is very solid, and treating
>>> data frames as lists of columns consistently helps all of us.
>>
>> I think outlawing matrix notation is a really bad idea. It makes code
>> harder to read, and makes it much harder to switch to matrices, which
>> sometimes gives a huge speed boost to code.
>>
>> For example, John Fox posted an example that showed that operations on
>> whole columns of dataframes is about twice as fast using list notation
>> as using matrix notation. But for operating on whole rows,
>
>... or on individual elements ...
>
>> matrices are
>> about 100 times faster than dataframes. You shouldn't use notation that
>> makes the switch to matrices more difficult.
>>
>> Duncan Murdoch
>>
>>>
>>> On December 21, 2021 9:02:56 AM PST, Duncan Murdoch
>>> <murdoch.dun...@gmail.com> wrote:
>>>> On 21/12/2021 11:59 a.m., Jeff Newmiller wrote:
>>>>> Intuitive, perhaps, but noticeably slower. And it doesn't work on tibbles
>>>>> by design. Data frames are lists of columns.
>>>>
>>>> That's just one of the design flaws in tibbles, but not the worst one.
>>>>
>>>> Duncan Murdoch
>>>>
>>>>>
>>>>> On December 21, 2021 8:38:35 AM PST, Duncan Murdoch
>>>>> <murdoch.dun...@gmail.com> wrote:
>>>>>> On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>>>>>>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>>>>>>>> Thanks for the reply.
>>>>>>>>
>>>>>>>> sort(unique(Data[1]))
>>>>>>>> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
>>>>>>>> decreasing)) :
>>>>>>>>   undefined columns selected
>>>>>>>
>>>>>>> That's the wrong syntax: Data[1] is not "column one of Data". Use
>>>>>>> Data[[1]] for that, so
>>>>>>>
>>>>>>> sort(unique(Data[[1]]))
>>>>>>
>>>>>> Actually, I'd probably recommend
>>>>>>
>>>>>> sort(unique(Data[, 1]))
>>>>>>
>>>>>> instead. This treats Data as a matrix rather than as a list.
>>>>>> Dataframes are lists that look like matrices, but to me the matrix
>>>>>> aspect is usually more intuitive.
>>>>>>
>>>>>> Duncan Murdoch
>>>>>>
>>>>>>>
>>>>>>> I think Rui already pointed out the typo in the quoted text below...
>>>>>>>
>>>>>>> Duncan Murdoch
>>>>>>>
>>>>>>>>
>>>>>>>> The recommended syntax did not work, as listed above.
>>>>>>>>
>>>>>>>> What I want is the sort of distinct column output. Again, the column
>>>>>>>> may be text or numbers. This is a huge analysis effort with data
>>>>>>>> coming at me from many different sources.
>>>>>>>>
>>>>>>>>
>>>>>>>> *Stephen Dawson, DSL*
>>>>>>>> /Executive Strategy Consultant/
>>>>>>>> Business & Technology
>>>>>>>> +1 (865) 804-3454
>>>>>>>> http://www.shdawson.com <http://www.shdawson.com>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
>>>>>>>>> On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>> Thanks everyone for the replies.
>>>>>>>>>>
>>>>>>>>>> It is clear one either needs to write a function or put the unique
>>>>>>>>>> entries into another dataframe.
>>>>>>>>>>
>>>>>>>>>> It seems odd R cannot sort a list of unique column entries with ease.
>>>>>>>>>> Python and SQL can do it with ease.
>>>>>>>>>
>>>>>>>>> I've seen several responses that looked pretty simple. It's hard to
>>>>>>>>> beat sort(unique(x)), though there's a fair bit of confusion about
>>>>>>>>> what you actually want. Maybe you should post an example of the code
>>>>>>>>> you'd use in Python?
>>>>>>>>>
>>>>>>>>> Duncan Murdoch
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> QUESTION
>>>>>>>>>> Is there a simpler means other than the unique function to capture
>>>>>>>>>> distinct column entries, then sort that list?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/20/21 5:53 PM, Rui Barradas wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Inline.
>>>>>>>>>>>
>>>>>>>>>>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> sort(unique(Data[[1]]))
>>>>>>>>>>>>
>>>>>>>>>>>> This syntax provides row numbers, not column values.
>>>>>>>>>>>
>>>>>>>>>>> This is not right.
>>>>>>>>>>> The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
>>>>>>>>>>> extracts the column vector.
>>>>>>>>>>>
>>>>>>>>>>> As for my previous answer, it was not addressing the question, I
>>>>>>>>>>> misinterpreted it as being a question on how to sort by numeric
>>>>>>>>>>> order when the data is not numeric. Here is a, hopefully, complete
>>>>>>>>>>> answer. Still with package stringr.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> cols_to_sort <- 1:4
>>>>>>>>>>>
>>>>>>>>>>> Data2 <- lapply(Data[cols_to_sort], \(x){
>>>>>>>>>>>   stringr::str_sort(unique(x), numeric = TRUE)
>>>>>>>>>>> })
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Or using Avi's suggestion of writing a function to do all the work
>>>>>>>>>>> and simplify the lapply loop later,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
>>>>>>>>>>> Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hope this helps,
>>>>>>>>>>>
>>>>>>>>>>> Rui Barradas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Running a simple syntax set to review entries in dataframe
>>>>>>>>>>>>> columns. Here is the working code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Data <- read.csv("./input/Source.csv", header=T)
>>>>>>>>>>>>> describe(Data)
>>>>>>>>>>>>> summary(Data)
>>>>>>>>>>>>> unique(Data[1])
>>>>>>>>>>>>> unique(Data[2])
>>>>>>>>>>>>> unique(Data[3])
>>>>>>>>>>>>> unique(Data[4])
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to sort the unique entries. The data in the various
>>>>>>>>>>>>> columns are not defined as numbers, but also text. I realize 1 and
>>>>>>>>>>>>> 10 will not sort properly, as the column is not defined as a
>>>>>>>>>>>>> number, but want to see what I have in the columns viewed as
>>>>>>>>>>>>> sorted.
>>>>>>>>>>>>>
>>>>>>>>>>>>> QUESTION
>>>>>>>>>>>>> What is the best process to sort unique output, please?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.

-- 
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.