Re: [R] fusion of two matrices (numerical and logical)

2020-09-19 Thread Richard O'Keefe
(1) Using 'C == TRUE' (when you know C is logical) is equivalent to just plain C, only obscure. Similarly, 'C == FALSE' is more confusing than !C. (2) Consider B[C]. The rows of C have 2, 1, 1, 2, 1 TRUE entries, so the result here *cannot* be a rectangular array. And whatever it
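A small sketch of both points. The matrices B and C are made up for illustration; only the per-row TRUE counts (2, 1, 1, 2, 1) come from the message.

```r
# Hypothetical data: C has 2, 1, 1, 2, 1 TRUE entries in its five rows.
B <- matrix(1:10, nrow = 5)
C <- matrix(c(TRUE, TRUE, FALSE, TRUE, FALSE,
              TRUE, FALSE, TRUE, TRUE, TRUE), nrow = 5)

same_as_C    <- identical(C == TRUE, C)    # comparing with TRUE changes nothing
same_as_notC <- identical(C == FALSE, !C)  # comparing with FALSE is just negation

picked <- B[C]       # a plain vector in column-major order, not a matrix
print(rowSums(C))    # TRUE count per row
print(picked)
print(dim(picked))   # NULL: the rectangular shape is gone
```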

Re: [R] Stats help for dissertation project

2020-09-19 Thread Richard O'Keefe
In fairness to Raija Hallam, I've met masters and doctoral students whose supervisors hadn't a clue. (Heck, I once worked at a University where the staff evaluation process involved taking the means of 5-point ordinal variables...) Two of these cases stick in my mind: one where the student had mo

Re: [R] Please need help to finalize my code

2020-10-13 Thread Richard O'Keefe
What do you *mean* "when you want to use the kernels". WHICH kernels? Use to do WHAT? In your browser, visit cran.r-project.org then select "Packages" from the list on the left. Then pick the alphabetic list. Now search for 'kernel'. You will find dozens of matches. On Wed, 14 Oct 2020 at 05:15, P

Re: [R] Need help in R code of the functional data .

2020-10-17 Thread Richard O'Keefe
I do not understand your question. Are you talking about "functional data analysis", the statistical analysis of data where some of the covariates are (samples from) continuous functions? There are books and tutorials about doing that in R. Are you talking about "functional data structures", as

Re: [R] FREDR and R 3.6

2020-10-31 Thread Richard O'Keefe
I'm running Ubuntu 18.04 LTS and r-base/bionic-cran35,now 3.6.3-1bionic all [installed] 3.6.3 is also the latest version in the repository. On Fri, 30 Oct 2020 at 12:21, Marc Schwartz via R-help wrote: > > > On Oct 29, 2020, at 6:35 PM, H wrote: > > > > On 10/29/2020 01:49 PM, Marc Schwartz wr

Re: [R] change frequency of wind data correctly

2020-12-06 Thread Richard O'Keefe
To be honest, I would do this one of two ways. (1) Use ?decimate from library(signal), decimating by a factor of three. (2) Convert the variable to an (n/3)*3 matrix using as.matrix then use rowMeans or apply. On Thu, 3 Dec 2020 at 06:55, Stefano Sofia wrote: > Dear list users, > I hav
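A minimal sketch of option (2), under two assumptions of mine: the series length is a multiple of three, and the reshape is done with matrix(..., byrow = TRUE) rather than as.matrix. The readings are made up.

```r
# Hypothetical series of 12 readings; every 3 consecutive readings form one row.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
stopifnot(length(x) %% 3 == 0)           # the reshape needs a multiple of 3
m <- matrix(x, ncol = 3, byrow = TRUE)   # an (n/3) x 3 matrix
coarse <- rowMeans(m)                    # one averaged value per block of three
print(coarse)
```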

Re: [R] Problem in cluster sampling: 'mixed with negative subscripts'

2020-12-19 Thread Richard O'Keefe
More accurately, in x[i] where x and i are simple vectors, i may be a mix of positive integers and zeros where the zeros contribute nothing to the result or it may be a MIX of negative integers and zeros where the zeros contribute nothing to the result and -k means "do not include element k".
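The rule can be checked on a made-up vector (the values are arbitrary):

```r
x <- c(10, 20, 30, 40)

pos <- x[c(0, 2, 0, 3)]   # zeros contribute nothing: elements 2 and 3
neg <- x[c(-1, 0, -3)]    # zeros contribute nothing: all but elements 1 and 3

# Mixing positive and negative subscripts is an error.
mixed <- tryCatch(x[c(1, -2)], error = function(e) "error")

print(pos)
print(neg)
print(mixed)
```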

Re: [R] Image processing in R for BMI calculation

2021-03-01 Thread Richard O'Keefe
"Body Mass Index" is a rather bizarre thing: body.mass.in.kg / height.in.m^2 I have never been able to find any biological or physical meaning for this. Yet clinicians are solemnly advised to measure the weight to the nearest 0.1kg and the height to the nearest 0.1cm. How do you propose to determ

Re: [R] Help in modifying code to extract data from url

2021-05-22 Thread Richard O'Keefe
The source being a URL is not important. The important things are - what the structure of the JSON data is - what the MEANING of the JSON data is - what that meaning says about what SHOULD appear in the data frame in these cases. Arguably this isn't even an R question at all. It's a question

Re: [R] Puzzled over "partial"

2021-07-27 Thread Richard O'Keefe
In this context, "partial" is not the name of any function or package in R. It is just the name of a parameter. And its meaning, which is specific to sort(), is spelled out in the documentation for sort: > ?sort ... If ‘partial’ is not ‘NULL’, it is taken to contain indices of elements
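A short illustration of what the documented behaviour amounts to, on made-up numbers: after sort(x, partial = k), position k holds the value a full sort would put there, with nothing larger before it and nothing smaller after it.

```r
x <- c(5, 3, 8, 1, 9, 2)
k <- 3
p <- sort(x, partial = k)

# p[k] is in its fully sorted position; the rest is only partitioned around it.
kth_ok   <- p[k] == sort(x)[k]
left_ok  <- all(p[seq_len(k - 1)] <= p[k])
right_ok <- all(p[(k + 1):length(p)] >= p[k])
print(c(kth_ok, left_ok, right_ok))
```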

Re: [R] Cumulates of snowfall within a given interval

2021-08-01 Thread Richard O'Keefe
> x <- c(1,2,3) # a vector of numbers, such as snowfallsum > (cx <- cumsum(x)) # a vector of cumulative sums. 1 3 6 > i <- 1 # The starting point. > j <- 2 # The ending point. > cx[j] - cx[i-1] # sum of x[i] + ... + x[j] ERROR! > cx <- c(0, cx) # Oops, we need this step. > cx[j+1] - cx[i] So usin

Re: [R] Calculation of Age heaping

2021-08-09 Thread Richard O'Keefe
According to Wikipedia, this is the definition of Whipple's index: "The index score is obtained by summing the number of persons in the age range 23 and 62 inclusive, who report ages ending in 0 and 5, dividing that sum by the total population between ages 23 and 62 years inclusive, and multiplyin
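The quoted definition translates fairly directly into R. This is only a sketch: the function name is hypothetical, and the multiplier 500 is an assumption on my part, since the quotation above is cut off before it (500 is the usual value, so that data with no heaping scores 100).

```r
# Hypothetical helper; 'age' is a vector of reported ages in whole years.
whipple <- function(age) {
  in_range <- age >= 23 & age <= 62          # the 23..62 window, inclusive
  heaped   <- in_range & (age %% 5 == 0)     # reported ages ending in 0 or 5
  500 * sum(heaped) / sum(in_range)          # 500 is the assumed multiplier
}

no_heaping   <- whipple(23:62)          # one person at each age
full_heaping <- whipple(c(25, 30, 40))  # everyone reports a multiple of 5
print(c(no_heaping, full_heaping))
```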

Re: [R] Calculation of Age heaping

2021-08-10 Thread Richard O'Keefe
If you want to look at each digit, you should take a step back and think about what the Whipple index is actually doing. Basically, the model underlying the Whipple index is that Pr(age = xy) = Pr(age = x*)Pr(age = *y) if there is no age heaping. Or rather, since the age is restricted to 23..62 (

Re: [R] A glitch (???) in tools::texi2pf.

2021-08-29 Thread Richard O'Keefe
It is a general "feature" of TeX that documents with tables of contents, indices, bibliographies, and so on, have to be "iterated to convergence". A couple of PhD theses came out of Stanford; the problem is in that which page one thing goes on depends on where other things went, which depends on w

Re: [R] Calculate daily means from 5-minute interval data

2021-08-29 Thread Richard O'Keefe
Why would you need a package for this?
> samples.per.day <- 12*24
That's 12 5-minute intervals per hour and 24 hours per day. Generate some fake data.
> x <- rnorm(samples.per.day * 365)
> length(x)
[1] 105120
Reshape the fake data into a matrix where each row represents one 24-hour period.
> m
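The message is cut off at the reshape step, so the continuation below is my guess at the idea, not the original code. It assumes a complete, regularly spaced record (no missing timestamps) and uses made-up data whose daily means are known in advance.

```r
samples.per.day <- 12 * 24

# Hypothetical data: every reading on day d equals d, so the daily mean is d.
x <- rep(1:365, each = samples.per.day)

# One row per 24-hour period, filled row by row.
m <- matrix(x, ncol = samples.per.day, byrow = TRUE)
daily.means <- rowMeans(m)
print(head(daily.means))
```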

Re: [R] Calculate daily means from 5-minute interval data

2021-08-30 Thread Richard O'Keefe
ide.) In an important sense, there is no right way to analyse river flow data *on its own*. On Mon, 30 Aug 2021 at 14:47, Jeff Newmiller wrote: > > IMO assuming periodicity is a bad practice for this. Missing timestamps > happen too, and there is no reason to build a broken analy

Re: [R] Calculate daily means from 5-minute interval data

2021-08-30 Thread Richard O'Keefe
affect the point that you probably should not be doing any of this. On Tue, 31 Aug 2021 at 00:42, Rich Shepard wrote: > > On Mon, 30 Aug 2021, Richard O'Keefe wrote: > > > Why would you need a package for this? > >> samples.per.day <- 12*24 > > > > That

Re: [R] Calculate daily means from 5-minute interval data

2021-08-30 Thread Richard O'Keefe
g time. On Tue, 31 Aug 2021 at 11:34, Rich Shepard wrote: > > On Tue, 31 Aug 2021, Richard O'Keefe wrote: > > > I made up fake data in order to avoid showing untested code. It's not part > > of the process I was recommending. I expect data recorded every N minutes >

Re: [R] Calculate daily means from 5-minute interval data

2021-08-31 Thread Richard O'Keefe
I wrote: > > By the time you get the data from the USGS, you are already far past the > > point > > where what the instruments can write is important. Rich Shepard replied: > The data are important because they show what's happened in that period of > record. Don't physicians take a medical histor

Re: [R] Splitting a data column randomly into 3 groups

2021-09-03 Thread Richard O'Keefe
Your question is ambiguous. One reading is
n <- length(table$Data)
m <- n %/% 3
s <- sample(1:n, n)
X <- table$Data[s[1:m]]
Y <- table$Data[s[(m+1):(2*m)]]
Z <- table$Data[s[(m*2+1):(3*m)]]
On Fri, 3 Sept 2021 at 13:31, AbouEl-Makarim Aboueissa wrote: > > Dear All: > > How to split

Re: [R] how to find "first" or "last" record after sort in R

2021-09-10 Thread Richard O'Keefe
Let's simplify this to consider a single vector, such as x <- c(1,1,1,2,2,3,3,3,3,4,5,5,5) in which equal elements are in contiguous blocks. > diff(x) [1] 0 0 1 0 1 0 0 0 1 1 0 0 Of course, there could be gaps, or the sequence might be descending instead of ascending. So > diff(x) != 0 We are nea
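The message breaks off before the last step. One way to finish the idea, which may differ from what the author went on to write, is to pad the change flags with TRUE at the appropriate end:

```r
x <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5, 5)
changed <- diff(x) != 0

first <- c(TRUE, changed)   # element 1 always starts a block
last  <- c(changed, TRUE)   # the final element always ends a block

print(which(first))
print(which(last))
```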

Re: [R] Reading File Sizes: very slow!

2021-09-25 Thread Richard O'Keefe
On a $150 second-hand laptop with 0.9GB of library, and a single-user installation of R so only one place to look
LIBRARY=$HOME/R/x86_64-pc-linux-gnu-library/4.0
cd $LIBRARY
echo "kbytes package"
du -sk * | sort -k1n
took 150 msec to report the disc space needed for every package. That' On Sun,

Re: [R] assumptions about how things are done

2021-10-09 Thread Richard O'Keefe
Colour me confused. if (...) { ... } else { ... } is a control structure. It requires the test to evaluate to a single logical value, then it evaluates one choice completely and the other not at all. It is special syntax. ifelse(..., ..., ...) is not a control structure. It is not special syntax
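The difference is easy to see side by side. The values below are arbitrary.

```r
x <- c(1, -2, 3)

# ifelse() is an ordinary vectorised function: one answer per element.
v <- ifelse(x > 0, "pos", "neg")

# if/else is a control structure: one test, one branch evaluated.
s <- if (x[1] > 0) "pos" else stop("this branch is never evaluated")

print(v)
print(s)
```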

Re: [R] How to find local minimum between distributions using mixtools?

2021-10-14 Thread Richard O'Keefe
Do you really want the minimum? It sounds as though your model is a*N(x1,s1) + (1-a)*N(x2,s2) where you use mixtools to estimate the parameters. Finding the derivative of that is fairly straightforward calculus, and solving for the derivative being zero gives you extrema (you want the one between
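For comparison, here is a numerical sketch rather than the calculus route suggested above: it minimises the fitted mixture density between the two means with optimize(). The parameter values are made up; in practice they would come from the mixtools fit.

```r
# Hypothetical fitted parameters for a*N(x1,s1) + (1-a)*N(x2,s2).
a <- 0.5; x1 <- 0; s1 <- 1; x2 <- 4; s2 <- 1

mix <- function(x) a * dnorm(x, x1, s1) + (1 - a) * dnorm(x, x2, s2)

# Search only between the two component means.
valley <- optimize(mix, interval = c(x1, x2))$minimum
print(valley)   # by symmetry the dip should sit near the midpoint
```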

Re: [R] How to find local minimum between distributions using mixtools?

2021-10-14 Thread Richard O'Keefe
automatic method more than > function analysis... > > On Thu, Oct 14, 2021 at 9:06 AM Richard O'Keefe wrote: > > > > Do you really want the minimum? > > It sounds as though your model is a*N(x1,s1) + (1-a)*N(x2,s2) where > > you use mixtools to estimate > &g

Re: [R] Replacing NA s with the average

2021-10-18 Thread Richard O'Keefe
It *sounds* as though you are trying to impute missing data. There are better approaches than just plugging in means. You might want to look into CALIBERrfimpute or missForest. On Tue, 19 Oct 2021 at 01:39, Admire Tarisirayi Chirume wrote: > > Good day colleagues. Below is a csv file attached whi

Re: [R] R vs Numpy

2021-10-31 Thread Richard O'Keefe
Reasons for preferring one to another: - taste. If you like curly braces, you'll prefer R. If you like indentation forced by syntax, you'll prefer Python. - compatibility. This morning I was trying to use a web site where all the Python examples were non-functional due to either of bo

Re: [R] Date read correctly from CSV, then reformatted incorrectly by R

2021-11-21 Thread Richard O'Keefe
CSV data is very often strangely laid out. For analysis, Buffer Date Reading 100... ... 100... ... and so on is more like what a data frame should be. I get quite annoyed when I finally manage to extract data from a government agency only to find that my tax money has been spent on maki

Re: [R] Large data and space use

2021-11-28 Thread Richard O'Keefe
If you have enough data that running out of memory is a serious problem, then a language like R or Python or Octave or Matlab that offers you NO control over storage may not be the best choice. You might need to consider Julia or even Rust. However, if you have enough data that running out of mem

Re: [R] Question about Rfast colMins and colMaxs

2021-12-02 Thread Richard O'Keefe
What puzzles me is why you are not just using lapply(some.data.frame, min) lapply(some.data.frame, max) or as.vector(lapply(...)) Why go to another package for this? Is it the indices you want? col.min.indices <- function (some.data.frame) { v <- sapply(some.data.frame, function (column)
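A base-R sketch of both readings of the question, on a made-up data frame. It uses sapply() so the results come back as named vectors; this is not the truncated function from the message.

```r
some.data.frame <- data.frame(a = c(3, 1, 2), b = c(9, 8, 7))

col.mins     <- sapply(some.data.frame, min)        # smallest value per column
col.maxs     <- sapply(some.data.frame, max)        # largest value per column
col.min.rows <- sapply(some.data.frame, which.min)  # row index of each minimum

print(col.mins)
print(col.maxs)
print(col.min.rows)
```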

Re: [R] subset data frame problem

2021-12-13 Thread Richard O'Keefe
You want to DELETE rows satisfying the condition P & Q. The subset() function requires an expression saying what you want to RETAIN, so you need subset(PD, !(P & Q)). test <- subset(PD, !(Class == "1st" & Survived == "No")) By de Morgan's laws, !(P & Q) is the same as (!P) | (!Q) so you could als
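Both forms can be checked against each other on a tiny stand-in for PD (the three rows are made up):

```r
PD <- data.frame(Class    = c("1st", "1st", "2nd"),
                 Survived = c("No",  "Yes", "No"))

kept1 <- subset(PD, !(Class == "1st" & Survived == "No"))
kept2 <- subset(PD, Class != "1st" | Survived != "No")   # de Morgan form

print(kept1)
print(identical(kept1, kept2))
```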

Re: [R] checkpointing

2021-12-13 Thread Richard O'Keefe
I used to work on a Prolog implementation that did something similar. At any point you could explicitly save a snapshot of the current state and then from the operating system command line, resume it. This wasn't really for checkpointing. It was so that you could load up a customised environment,

Re: [R] checkpointing

2021-12-13 Thread Richard O'Keefe
Use VirtualBox. You can take a 'snapshot' of a running virtual machine, either from the GUI or from the CLI (vboxmanage snapshot ...) and restore it later. This requires NO changes to R. Snapshots can be restored on another machine of the same kind with the same system software. VirtualBox is f

Re: [R] Error Awareness

2021-12-24 Thread Richard O'Keefe
You want to read this: http://adv-r.had.co.nz/Exceptions-Debugging.html It describes all the ways that R can report a problem and all the ways you can catch such a report while still in R. Let me heartily recommend the whole site, or better yet, the book https://www.amazon.com/dp/0815384572/ref=

Re: [R] Convert a character string to variable names

2022-02-10 Thread Richard O'Keefe
mming. > > I have never used get(), so I will keep that in mind. I agree that it > makes life much easier to enter the data in the way it will be analyzed. > > > > > -Original Message- > From: Jeff Newmiller > Sent: Tuesday, February 8, 2022 10:10 PM > T

Re: [R] SDLC methodology for R and Data science......

2022-02-13 Thread Richard O'Keefe
There are at least two ways to use R. If you have devised a statistical/data science technique and are writing a package to be used by other people, that is normal software development that happens to be using R and the R tool. Lots of attention to documentation and tests. Test-Driven Development

Re: [R] Convert a character string to variable names

2022-02-14 Thread Richard O'Keefe
th the spirit of the language. On Mon, 14 Feb 2022 at 14:57, Ebert,Timothy Aaron wrote: > But I find things like this website on mutable and immutable objects in > python “ > https://www.geeksforgeeks.org/mutable-vs-immutable-objects-in-python/” > Would this be better titled “Objects ve

Re: [R] Is there a Truth Table Generator in R?

2022-03-13 Thread Richard O'Keefe
I too have been wondering what "a truth table generator" meant to the OP. There are web sites like https://web.stanford.edu/class/cs103/tools/truth-table-tool/ where you can type in a formula and it will display a truth table with a column for each variable and a column for the result. The last pr
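If that is the kind of table wanted, base R gets most of the way there with expand.grid(). A sketch, with an arbitrary example formula:

```r
# One column per variable, one row per combination of truth values.
tt <- expand.grid(p = c(FALSE, TRUE), q = c(FALSE, TRUE))

# One more column for the formula being tabulated (here: p and not q).
tt$result <- with(tt, p & !q)
print(tt)
```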

Re: [R] A question about Spatial in Kriging

2022-03-25 Thread Richard O'Keefe
Start with a good book like "Applied Spatial Data Analysis with R". If you want to do spatial data analysis, then you are going to need measurements at lots of different places in space. On Thu, 24 Mar 2022 at 23:14, Hasliza Rusmili wrote: > Thank you very much. I will ask the question there. >

Re: [R] A question about Spatial in Kriging

2022-03-25 Thread Richard O'Keefe
Thank you for the reference to "Spatial Predictive Modeling with R". I look forward to reading it.

Re: [R] Symbol/String comparison in R

2022-04-14 Thread Richard O'Keefe
under the impression that > this particular collation was in fact supposed to collate according to the > numerical magnitude of the UTF-8 code points but it does not appear to do > so. > > On April 14, 2022 4:25:17 AM PDT, Richard O'Keefe > wrote: > >To the original

Re: [R] Is there a canonical way to pronounce CRAN?

2022-05-06 Thread Richard O'Keefe
I would like to point out that there is an English word "cran" (one syllable, rhymes with "can" "ban" "than" ...). It means " a measure of fresh herrings, equivalent to 37 1/2 gallons". If you are going to use "CRAN" as the name of something, you are going to have to expect me to pronounce it lik

[R] How to represent tree-structured values

2022-05-29 Thread Richard O'Keefe
There is a kind of data I run into fairly often which I have never known how to represent in R, and nothing I've tried really satisfies me. Consider for example ... - injuries ... - injuries to limbs ... - injuries to extremities ... - injuries to hands - i

Re: [R] Create a categorical variable using the deciles of data

2022-06-15 Thread Richard O'Keefe
but they too take one approach to solve a problem rather > than "here is a problem" and "these are all possible solutions." I > appreciate seeing alternative solutions. > > Tim > > -Original Message- > From: R-help On Behalf Of Richard O'Kee

Re: [R] A humble request

2022-07-03 Thread Richard O'Keefe
Zubair Chishti < mzchis...@eco.qau.edu.pk> wrote: > Dear Respected Experts and specifically Professor Richard O'Keefe, > Thank you so much for your precious time and generous help. However, the > problem is still there and I am just unable to resolve it due to the lack > of exper

Re: [R] Please guide

2022-07-11 Thread Richard O'Keefe
(1) Your sample code refers to a file DY_Table.xlsx but the file you attached to a later message is called Data_oil_agri.xlsx and I find it hard to believe that they are the same file. (2) gmail offered me two different ways to download the file, but neither of them worked. Fortun

Re: [R] Please guide

2022-07-11 Thread Richard O'Keefe
he just reply to write the table. So, I > need to know how to write the table for the code mentioned above? > I hope that you got my question now. > > Regards > Chishti > > On Mon, Jul 11, 2022 at 3:05 PM Richard O'Keefe wrote: > >> (1) Your sample code refers to a

Re: [R] Does the function "c" have a character limit?

2022-07-13 Thread Richard O'Keefe
Breaking up the *line* doesn't mean breaking up the *command*. For example, x <- c( "FOOBAR", # 1 ... "FOOBAR", # 4999 "UGGLE") works fine, with source(..), with "R -f ...", and other ways. Each *line* is short, but it's still one *command*. I'd probably put that much data in a fil

Re: [R] Need to insert various rows of data from a data frame after particular rows from another dataframe

2022-07-27 Thread Richard O'Keefe
I'm retired, and I had an hour on my hands while tea cooked and my granddaughter did her homework, and I just *love* showing off how helpful I am. Good news: someone finally looked at your data. (That would be me.) Bad news: it's going to be a lot of work to do what you want to, and YOU SHOULDN'T

Re: [R] Unicode chars

2022-08-25 Thread Richard O'Keefe
PDFLaTeX does support Latin-1, and this is a Latin-1 character. On Thu, 25 Aug 2022 at 15:35, Jeff Newmiller wrote: > Are you aware that pdfLatex does not support Unicode? You need to use > xeLatex. But I don't use Sweave, so I don't know how you go about making > that choice. > > On August 24,

Re: [R] inconsistency in switch statements.....

2022-09-07 Thread Richard O'Keefe
You DON'T need to use backticks. switch() is much older than backticks. Ordinary quotation marks are fine. > switch(as.character(1), "2"="YES", "1"="NO") [1] "NO" On Thu, 8 Sept 2022 at 07:46, akshay kulkarni wrote: > Dear Bert, > Thanks...I went through the doc pages but c

Re: [R] Mathematical working procedure of imputation methods (medianImpute, knnImpute, and bagImpute) in caret package R

2022-09-20 Thread Richard O'Keefe
?preProcess k-nearest neighbor imputation is carried out by finding the k closest samples (Euclidian distance) in the training set. Imputation via bagging fits a bagged tree model for each predictor (as a function of all the others). This method is simple, accurate and acce

Re: [R] Write text file in Fortran format

2022-09-21 Thread Richard O'Keefe
Background: there is a data file whose records, after a header, can be described by the Fortran format given in the header. YES, you can easily read that file in R, and you don't even need to know anything about Fortran formats to do it. You can read the file as a data frame using read.table using

Re: [R] How long does it take to learn the R programming language?

2022-09-28 Thread Richard O'Keefe
How long does it take to learn R? Meaningless question. Who is learning? Are they new to programming? What other programming languages do they know? Are they new to statistics? What other statistics environments do they know? Are they learning by themselves? Do they have a mentor? Fellow students?

Re: [R] Reading very large text files into R

2022-09-29 Thread Richard O'Keefe
If I had this problem, in the old days I'd've whipped up a tiny AWK script. These days I might use xsv or qsv. BUT first I would want to know why these extra fields are present and what they signify. Are they good data that happen not to be described in the documentation? Do they represent a def

Re: [R] fortune nomination WAS: Re: How long does it take to learn the R programming language?

2022-09-29 Thread Richard O'Keefe
"R longa, vita brevis." On Thu, 29 Sept 2022 at 07:02, Berry, Charles wrote: > Aha! > CCB > > > On Sep 27, 2022, at 6:08 PM, Rolf Turner > wrote: > > > > > > On Mon, 26 Sep 2022 11:14:57 +0800 > > Turritopsis Dohrnii Teo En Ming wrote: > > > >> Subject: How long does it take to learn the R pro

Re: [R] cannot print a list with cat

2022-10-26 Thread Richard O'Keefe
\n is for TERMINATING lines. Just like in C, C++, Java, C#, Python, Ruby, Erlang, pretty much everything that uses \n in strings at all. sprintf("gradtol = %e\n", mycontrol$gradtol) makes sense. More generally, sprintf() takes as many arguments as you care to give it, so cat(sprintf("tol = %e\nr
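A runnable version of the pattern. The list mycontrol and its values are stand-ins, and steptol is a hypothetical second field added only to show several arguments in one call.

```r
mycontrol <- list(gradtol = 1e-6, steptol = 1e-8)   # hypothetical values

# Each line is terminated by \n, not separated by it.
line <- sprintf("gradtol = %e\n", mycontrol$gradtol)
cat(line)

# sprintf() takes as many arguments as the format has conversions.
cat(sprintf("gradtol = %e\nsteptol = %e\n",
            mycontrol$gradtol, mycontrol$steptol))
```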

Re: [R] unexpected 'else' in " else"

2022-10-26 Thread Richard O'Keefe
This is explained in books about S and R. The first place to look is of course > ?"if" which says Note that it is a common mistake to forget to put braces ('{ .. }') around your statements, e.g., after 'if(..)' or 'for()'. In particular, you should not have a newline between '}'
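The rule can be demonstrated without typing at the prompt by handing both layouts to parse(). A small sketch:

```r
bad  <- "if (FALSE) 1\nelse 2"                   # 'else' starts a new top-level line
good <- "if (FALSE) {\n  1\n} else {\n  2\n}"    # 'else' follows '}' on the same line

# The first layout is a syntax error; the second parses and evaluates.
bad_parses <- !inherits(try(parse(text = bad), silent = TRUE), "try-error")
good_value <- eval(parse(text = good))

print(bad_parses)
print(good_value)
```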

Re: [R] Single pdf of all R vignettes request

2022-10-31 Thread Richard O'Keefe
Let's put some numbers on that. The CRAN package repository claims 18770 packages. That excludes packages in other repositories, of course; the total collection of vignettes may not be discoverable. It could be useful to collect documents and vignettes and stuff them into an information retrieval s

Re: [R] Preexisting Work on Data- and Control-Flow Analysis

2022-12-07 Thread Richard O'Keefe
You should probably look at the compiler. One issue with data and control flow analysis in R is that f <- function (x, y) x + y f(ping, pong) may invoke an S3 (see ?S3groupGeneric, Ops) or S4 (see ?Arith) method, which might not have existed when f was analysed. Indeed, f <- function (x, y
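To make the point concrete, here is a sketch in which an S3 method for + is defined only after f already exists. The class name "ping" and the method body are invented for the example.

```r
f <- function(x, y) x + y
before <- f(1, 2)                      # ordinary arithmetic

# A method that did not exist when f was written (hypothetical class "ping").
`+.ping` <- function(e1, e2) "ping method called"
ping <- structure(1, class = "ping")
pong <- structure(2, class = "ping")

after <- f(ping, pong)                 # the same f now runs different code
print(before)
print(after)
```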

Re: [R] Integer division

2022-12-19 Thread Richard O'Keefe
The Fortran '08 standard says << One operand of type integer may be divided by another operand of type integer. Although the mathematical quotient of two integers is not necessarily an integer, Table 7.2 specifies that an expression involving the division operator with two operands of type integer
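For contrast with R: Fortran integer division truncates toward zero, whereas R's %/% rounds toward minus infinity. The sketch below shows the two conventions on one negative example; trunc(a / b) stands in for the Fortran behaviour.

```r
a <- -7L
b <- 2L

r_quot     <- a %/% b         # R: floor of the quotient
r_rem      <- a %% b          # R: remainder takes the sign of the divisor
trunc_quot <- trunc(a / b)    # truncation toward zero, as in Fortran

print(c(r_quot, r_rem, trunc_quot))
```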

Re: [R] Integer division

2022-12-20 Thread Richard O'Keefe
I was surprised that > there is no consensus regarding the definition of such elementary > functions. > > Göran > > On 2022-12-20 03:01, Richard O'Keefe wrote: > > The Fortran '08 standard says << > > One operand of type integer may be divided by anothe

Re: [R] Pipe operator

2023-01-03 Thread Richard O'Keefe
The simplest and best answer is "fashion". In FSharp, > (|>);; val it: ('a -> ('a -> 'b) -> 'b) The ability to turn f x y into y |> f x makes perfect sense in a programming language where Currying (representing a function of n arguments as a function of 1 argument that returns a function of n-1 arg

Re: [R] Pipe operator

2023-01-03 Thread Richard O'Keefe
"Does saving of variables speed up processing" no "or save memory" no. The manual is quite explicit: > ?"|>" ... Currently, pipe operations are implemented as syntax transformations. So an expression written as 'x |> f(y)' is parsed as 'f(x, y)'. Strictly speaking, using |> *doesn't* save any var

Re: [R] Pipe operator

2023-01-03 Thread Richard O'Keefe
This is both true and misleading. The shell pipe operation came from functional programming. In fact the shell pipe operation is NOT "flip apply", which is what |> is, but it is functional composition. That is out = let out = command cmd1 | cmd2 = \x.cmd2(cmd1(x)). Pragmatically, the Unix shell

Re: [R] R Certification

2023-01-04 Thread Richard O'Keefe
I note that Java, for example, has changed a LOT and a certificate from, say, 10 years ago, wouldn't impress me much today. The same can be said of C#, and of R. So the question would be, "what VALUE would a certificate about R provide?" Well, for one thing, it would be a certificate of proficien

Re: [R] return value of {....}

2023-01-10 Thread Richard O'Keefe
I am more than a little puzzled by your question. In the construct {expr1; expr2; expr3} all of the expressions expr1, expr2, and expr3 are evaluated, in that order. That's what curly braces are FOR. When you want some expressions evaluated in a specific order, that's why and when you use curly br
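A tiny check of both facts, the evaluation order and the value of the whole construct (the names are arbitrary):

```r
trace <- character(0)

value <- {
  trace <- c(trace, "expr1")   # evaluated first
  trace <- c(trace, "expr2")   # evaluated second
  length(trace) + 40           # the last expression gives the value of { }
}

print(trace)
print(value)
```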

Re: [R] return value of {....}

2023-01-15 Thread Richard O'Keefe
I wonder if the real confusion is not R's scope rules? (begin .) is not Lisp, it's Scheme (a major Lisp dialect), and in Scheme, (begin (define x ...) (define y ...) ...) declares variables x and y that are local to the (begin ...) form, just like Algol 68. That's weirdness 1. Javascript had a si

Re: [R] Plotmath isn't working for special characters

2023-01-24 Thread Richard O'Keefe
plot(1,1, main=quote(x>=y)) produces the symbol for me. plot(1,1, main=parse(text="x>=y")) also produces the symbol. setting value version R version 4.2.2 Patched (2022-11-10 r83330) os Ubuntu 22.04.1 LTS system x86_64, linux-gnu ui X11 language en_NZ:en collate en_NZ.iso88

Re: [R] overlaying two graphs / plots /lines

2023-02-09 Thread Richard O'Keefe
It's easy enough to do this, the question is "what does it MEAN?" If you overlay two graphs, what comparisons will people naturally make, and what do you want them to make? What transformations on the x axis would make two vertically aligned points about the "same" thing? What transformations on th

Re: [R] identify the distribution of the data

2023-02-09 Thread Richard O'Keefe
fitdistrplus is a great package. But the documentation for the fitdist function makes something very clear: fitdist(data, distr, ...) distr [is] A character string "name" naming a distribution for which the corresponding density function dname, the corresponding distribut

Re: [R] Simple Stacking of Two Columns

2023-04-04 Thread Richard O'Keefe
Just to repeat: you have NamesWide<-data.frame(Name1=c("Tom","Dick"),Name2=c("Larry","Curly")) and you want NamesLong<-data.frame(Names=c("Tom","Dick","Larry","Curly")) There must be something I am missing, because NamesLong <- data.frame(Names = c(NamesWide$Name1, NamesWide$Name2))

Re: [R] on lexical scoping....

2023-04-04 Thread Richard O'Keefe
R *does* search the environment stack.
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
What you seem to be missing is that a package may contain bindi

Re: [R] Matrix scalar operation that saves memory?

2023-04-13 Thread Richard O'Keefe
"wear your disc quite badly"? If you can afford a computer with 512 GB of memory, you can afford to pay $100 for a 2 TB external SSD, use it as scratch space, and throw it away after a month of use. A hard drive is expected to last for more than 40,000 hours of constant use. Are you sure that you

Re: [R] detect and replace outliers by the average

2023-04-21 Thread Richard O'Keefe
This can be seen as three steps: (1) identify outliers (2) replace them with NA (trivial) (3) impute missing values. There are packages for imputing missing data. See https://www.analyticsvidhya.com/blog/2016/03/tutorial-powerful-packages-imputing-missing-values/ Here I just want to address the fi

Re: [R] detect and replace outliers by the average

2023-04-21 Thread Richard O'Keefe
What does it mean when one column is just blank, neither a number nor NA, just nothing? On Fri, 21 Apr 2023 at 07:08, AbouEl-Makarim Aboueissa < abouelmakarim1...@gmail.com> wrote: > Dear All: the attached file in the .txt format > > > > *Re:* detect and replace outliers by the average > > > >

Re: [R] Tying to underdressed the magic of lm redux

2019-06-01 Thread Richard O'Keefe
You can find the names of the columns of a dataframe using colnames(my.df) A dataframe is a value just as much as a number is, and as such, doesn't _have_ a name. However, when you call a function in R, the arguments are not evaluated, and their forms can be recovered, just as "plot" does. In f
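A minimal sketch of how a function can recover the form of its argument, using substitute() and deparse(). The helper name show.name is hypothetical.

```r
# Hypothetical helper: returns the text of whatever expression was passed in.
show.name <- function(df) deparse(substitute(df))

my.df <- data.frame(x = 1:3, y = c(2, 4, 6))

nm   <- show.name(my.df)   # the argument's form, not its value
cols <- colnames(my.df)    # the column names belong to the value itself

print(nm)
print(cols)
```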

Re: [R] Tying to underdressed the magic of lm redux

2019-06-01 Thread Richard O'Keefe
PS: lm records a copy of the call in its result, but has no other use for any name the data frame may have had. On Sun, 2 Jun 2019 at 14:45, Richard O'Keefe wrote: > You can find the names of the columns of a dataframe using > colnames(my.df) > A dataframe is a value just as mu

Re: [R] Open a file which name contains a tilde

2019-06-06 Thread Richard O'Keefe
How can expanding tildes anywhere but the beginning of a file name NOT be considered a bug? On Thu, 6 Jun 2019 at 23:04, Ivan Krylov wrote: > On Wed, 5 Jun 2019 18:07:15 +0200 > Frank Schwidom wrote: > > > +> path.expand("a ~ b") > > [1] "a /home/user b" > > > How can I switch off any file cri

Re: [R] Open a file which name contains a tilde

2019-06-07 Thread Richard O'Keefe
arles wrote: > > > > On Jun 6, 2019, at 2:04 PM, Richard O'Keefe wrote: > > > > How can expanding tildes anywhere but the beginning of a file name NOT be > > considered a bug? > > > > > > I think that that IS what libreadline is doing if one

Re: [R] Merging a dataframe after subsetting with respect to several factors

2019-06-13 Thread Richard O'Keefe
How about just df$time[match(paste(df$a, df$b, df$c), c( "co mb o1", .. "co mb oN"))] On Fri, 14 Jun 2019 at 08:22, Tina Chatterjee wrote: > Hello everyone! > I have the following dataframe(df). > > a<-c("a1","a2","a2","a1","a1","a1") > b<-c("b1","b1","b1","b1","b1","b2") > c<-c(

Re: [R] Add transitivity to a matrix?

2019-06-17 Thread Richard O'Keefe
You have: a square logical matrix M representing a binary relation. You want: a similar matrix representing the least (fewest true cases) transitive relation extending what M represents. It sounds as though you are looking for the TRANSITIVE CLOSURE. You will find that in the 'relations' package.
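For small matrices the closure is also easy to compute in base R. The sketch below is Warshall's algorithm, not the 'relations' package interface, and the function name is made up.

```r
# Hypothetical helper: Warshall's algorithm on a square logical matrix.
transitive.closure <- function(M) {
  stopifnot(is.logical(M), nrow(M) == ncol(M))
  for (k in seq_len(nrow(M))) {
    # i relates to j if it already did, or if i relates to k and k to j.
    M <- M | outer(M[, k], M[k, ], "&")
  }
  M
}

M <- matrix(FALSE, 3, 3)
M[1, 2] <- TRUE
M[2, 3] <- TRUE              # 1 -> 2 -> 3
TC <- transitive.closure(M)
print(TC)                    # gains the pair 1 -> 3 and nothing else
```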

Re: [R] Help with a third ggplot error

2019-06-18 Thread Richard O'Keefe
Nobody else has asked the obvious question: why are the data squashed together like that in the first place? why not modify the process that generates the data so that it does not do that? Jamming things together like that is not common practice with CSV files, so what does the CSV file look lik

Re: [R] Regarding R doubt

2019-06-19 Thread Richard O'Keefe
You did not say what your doubt about R was. PL2.rasch has some class. > class(PL2.rasch) [1] 'Grofnigtz' # or whatever The summary function is really just a dispatcher. > summary.Grofnigtz ... a listing comes out here ... Or you could look in the source code of whatever package you are using.
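The dispatch can be seen with a toy object. The class name is the placeholder from the message; the method body is invented.

```r
obj <- structure(list(score = 1), class = "Grofnigtz")

# summary() itself only dispatches; the real work lives in the method.
summary.Grofnigtz <- function(object, ...) "summary for a Grofnigtz object"

out <- summary(obj)
print(class(obj))
print(out)
# For a method hidden inside a package namespace, getAnywhere("summary.<class>")
# will usually show its source.
```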

Re: [R] Regarding R doubt

2019-06-19 Thread Richard O'Keefe
fficulty and discrimination values directly, I just > want the simple formulas to calculate item difficulty and item > discrimination. > > Also how they have calculated theta(ability) and scores at the backend of > the code. > > > > > > Sent from Mail <https://go.microsoft.

Re: [R] Find the max entry in column 2 - that satisfies a condition given a fixed entry in column 1

2019-06-21 Thread Richard O'Keefe
I have read your message four times in the last couple of days, but I still have very little idea what you want. Let's try some things I've gleaned. You have a matrix with 9 rows and 2 columns, and there is a 2 somewhere in column 1. > m <- matrix(1:18, nrow=9, ncol=2) > m[c(4,7,8),1] <- 2 > m

Re: [R] Output for pasting multiple vectors

2019-06-25 Thread Richard O'Keefe
This has nothing to do with your problem, but given the heavy use of "=" to bind keyword parameters in R, I find the use of "=" for assignment as well confusing. It makes code harder to read than it needs to be. The historic " <- " assignment makes the distinction obvious. On Wed, 26 Jun 2019 at

Re: [R] Looking for R package to extract Concept from text files

2019-06-30 Thread Richard O'Keefe
Are you aware of https://www.tidytextmining.com/ On Mon, 1 Jul 2019 at 16:57, Mehdi Dadkhah wrote: > Thank you!! > Have a nice day! > With best regards, > > On Mon, Jul 1, 2019 at 6:57 AM Abby Spurdle wrote: > > > > > > In parts of these reports, people may state their > > > reasons for do not

Re: [R] Sample size required to estimate population variance

2019-07-02 Thread Richard O'Keefe
Does this help? https://www.r-bloggers.com/computing-sample-size-for-variance-estimation/ On Wed, 3 Jul 2019 at 10:23, Thomas Subia via R-help wrote: > Colleagues, > Can anyone suggest a package or code which might help me calculate the > minimum sample size required to estimate the population v

Re: [R] Matrix - remove [,1] from top row

2019-07-02 Thread Richard O'Keefe
(1) m[,1] is the first column of matrix (or dataframe) m. (2) The first row of matrix or dataframe m is m[1,] (3) To remove the first row of matrix or dataframe m, do m <- m[-1,] On Wed, 3 Jul 2019 at 08:59, Nicola Cecchino wrote: > Hello, > > I am simply trying to remove the [,1] row from
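
The three points can be checked on a small matrix. The matrix below is made up for illustration; drop = FALSE is an optional extra that keeps the result a matrix even if only one row remains.

```r
# Illustrative 3 x 2 matrix with named columns.
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))

first_col <- m[, 1]   # (1) the first column
first_row <- m[1, ]   # (2) the first row

# (3) remove the first row; drop = FALSE preserves the matrix shape.
m2 <- m[-1, , drop = FALSE]
dim(m2)
```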

Re: [R] Control the variable order after multiple declarations using within

2019-07-03 Thread Richard O'Keefe
Why not set all the new columns to dummy values to get the order you want and then set them to their final values in the order that works for that? On Thu, 4 Jul 2019 at 00:12, Kevin Thorpe wrote: > > > On Jul 3, 2019, at 3:15 AM, Sebastien Bihorel < > sebastien.biho...@cognigencorp.com> wrote:
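
A minimal sketch of that suggestion, using plain column assignment rather than within(). The data frame and the column names total and half are hypothetical.

```r
df <- data.frame(id = 1:3)

# Step 1: create placeholder columns in the order you want to see them.
df$total <- NA_real_
df$half  <- NA_real_

# Step 2: fill them in whatever order the calculations require.
df$half  <- df$id / 2          # computed first
df$total <- df$half + df$id    # depends on 'half'

names(df)  # column order was fixed in step 1
```

Assigning to an existing column replaces its values without moving it, so the display order and the computation order are decoupled.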

Re: [R] Using options(max.print = 1000000) to read in data

2019-07-09 Thread Richard O'Keefe
The obvious question is "what do you mean, FORMATTED AS a matrix?" Once you have read an object into R, you have no information about how it was formatted. Another question is "what do you mean, MATRIX"? Do you mean the kind of R object specifically recognised by is.matrix, or do you mean "rectangu

Re: [R] need help in if else condition

2019-07-10 Thread Richard O'Keefe
Since this has already been answered, I'll just mention one point that was not addressed. > d=c(1,2,3,"-","dnr","post",10) This is rather odd. > str(d) chr [1:7] "1" "2" "3" "-" "dnr" "post" "10" You can create a vector of logical values, or a vector of numbers, or a vector of strings, but if ther
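
To make the coercion concrete, here is a short sketch using the vector from the message. The as.numeric() step is an added illustration of how to get the numbers back, not something the original post did.

```r
d <- c(1, 2, 3, "-", "dnr", "post", 10)

# Mixing numbers and strings in c() coerces everything to character.
class(d)

# Converting back gives NA for the entries that are not numbers.
# as.numeric() would warn about them; the warning is suppressed here.
nums <- suppressWarnings(as.numeric(d))
nums
```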

Re: [R] need help in if else condition

2019-07-10 Thread Richard O'Keefe
The answer here is that in "ifelse(a < 3, ..)" you ALWAYS expect "a" to be a vector because there would be no point in using ifelse if it weren't. If you believe that "a" is or ought to be a single number, you write x <- if (a < 3) 1 else 2 The whole point of ifelse is to vectorise. On Thu,
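
A small sketch of the distinction, with made-up values: if/else for a single number, ifelse for a vector.

```r
# Scalar case: 'a' is a single number, so use if ... else.
a <- 2
x <- if (a < 3) 1 else 2

# Vector case: 'v' has several elements, so use ifelse().
v <- c(1, 5, 2)
y <- ifelse(v < 3, 1, 2)

x
y
```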

Re: [R] need help in if else condition

2019-07-10 Thread Richard O'Keefe
Expectation: ifelse will use the same "repeat vectors to match the longest" rule that other vectorised functions do. So a <- 1:5 b <- c(2,3) ifelse(a < 3, 1, b) => ifelse(T T F F F <<5>>, 1 <<1>>, 2 3 <<2>>) => ifelse(T T F F F <<5>>, 1 1 1 1 1 <<5>>, 2 3 2 3 2 <<5>>) => 1 1 2 3 2 and that is inde
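
The recycling example above can be run directly:

```r
a <- 1:5
b <- c(2, 3)

# 'yes' (length 1) and 'no' (length 2) are both recycled to length 5,
# then each element is picked according to the test.
result <- ifelse(a < 3, 1, b)
result
```

The recycled 'no' vector is 2 3 2 3 2, and positions 3 to 5 are taken from it.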

Re: [R] need help in if else condition

2019-07-12 Thread Richard O'Keefe
to be a quality-of-implementation bug. On Thu, 11 Jul 2019 at 04:14, Dénes Tóth wrote: > > > On 7/10/19 5:54 PM, Richard O'Keefe wrote: > > Expectation: ifelse will use the same "repeat vectors to match the > longest" > > rule that other vectori

Re: [R] need help in if else condition

2019-07-14 Thread Richard O'Keefe
< y; r[ix]<-x[ix]; r[!ix]<-y[!ix]; r}) >user system elapsed > 0.082 0.053 0.135 > > -pd > > > > On 12 Jul 2019, at 15:02 , Richard O'Keefe wrote: > > > > "ifelse is very slow"? Benchmark time. > >> x <- runif(100) >

Re: [R] need help in if else condition

2019-07-16 Thread Richard O'Keefe
ork problem. > > [1] https://cran.r-project.org/bin/linux/ubuntu/README.html > > On July 14, 2019 4:55:25 PM CDT, Richard O'Keefe wrote: > >Four-core AMD E2-7110 running Ubuntu 18.04 LTS. > >The R version is the latest in the repository: > >r-base/bionic,bion

Re: [R] Capturing positive and negative changes using R

2019-07-20 Thread Richard O'Keefe
If "Fardadj was expecting R to recognise the comma as the decimal" then it might be worth mentioning the 'dec = "."' argument of read.table and its friends. On Sun, 21 Jul 2019 at 12:48, Jeff Newmiller wrote: > It is possible that part of the original problem was that Fardadj was > expecting R
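
As an illustration of the dec argument, the sketch below reads a small made-up table that uses a comma as the decimal mark and a semicolon as the field separator. The data and column names are hypothetical.

```r
# Two rows of illustrative data with decimal commas.
txt <- "x;y\n1,5;2,25\n3,0;4,75"

# dec = "," tells read.table to treat the comma as the decimal mark.
df <- read.table(text = txt, header = TRUE, sep = ";", dec = ",")
str(df)
```

Without dec = ",", these columns would be read as character strings rather than numbers.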

Re: [R] How to create a data set from object/data frame?

2019-07-20 Thread Richard O'Keefe
I'm having a little trouble believing what I'm seeing. To the best of my knowledge, sample.info <- data.frame( + spl=paste('A', 1:8, sep=''), + stat=rep(c('cancer' , 'healthy'), each=4)) is not legal R syntax, and R 3.6.1 agrees with me Error: unexpected '=' in "x <- data.frame(+ spl=" Then I see
