Re: [R] Pass Parameters to RScript?

2017-10-31 Thread William Michels via R-help
Hello Morkus,

Is your question really about language inter-operability?
If so, have you checked out rJava?

"rJava: Low-Level R to Java Interface"
 https://CRAN.R-project.org/package=rJava
http://www.rforge.net/rJava/
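
For reference, the command-line route from Eric's link looks like this on
the R side -- a minimal sketch only, where "myscript.R" and the argument
values are hypothetical, and the script would be launched from Java (or any
other language) as e.g.  Rscript myscript.R 2017 whales

args <- commandArgs(trailingOnly = TRUE)   # character vector of trailing arguments
year    <- as.integer(args[1])             # "2017" -> 2017
species <- args[2]                         # "whales"
cat("year =", year, "; species =", species, "\n")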

Regards,

Bill.

W. Michels, Ph.D.


On Mon, Oct 30, 2017 at 8:10 AM, Morkus via R-help  wrote:
> Thanks Eric,
>
> I saw that page, too, but it states:
>
> "This post describes how to pass external arguments to R when calling a 
> Rscript with a command line."
>
> Not what I'm trying to do.
>
> Thanks for your reply.
>
> Sent from [ProtonMail](https://protonmail.com), Swiss-based encrypted email.
>
>>  Original Message 
>> Subject: Re: [R] Pass Parameters to RScript?
>> Local Time: October 30, 2017 9:39 AM
>> UTC Time: October 30, 2017 1:39 PM
>> From: ericjber...@gmail.com
>> To: Morkus 
>> r-help@r-project.org 
>>
>> I did a simple search and got  hits immediately, e.g.
>> https://www.r-bloggers.com/passing-arguments-to-an-r-script-from-command-lines/
>>
>> On Mon, Oct 30, 2017 at 2:30 PM, Morkus via R-help  
>> wrote:
>>
>>> Is it possible to pass parameters to an R Script, say, from Java or other 
>>> language?
>>>
>>> I did some searches, but came up blank.
>>>
>>> Thanks very much in advance,
>>>
>>> Sent from [ProtonMail](https://protonmail.com), Swiss-based encrypted email.
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Add vectors of unequal length without recycling?

2017-12-13 Thread William Michels via R-help
Maingo,

See previous discussion below on rbind.na() and cbind.na() scripts:

https://stat.ethz.ch/pipermail/r-help/2016-December/443790.html

You might consider binding first then adding orthogonally.
So rbind.na() then colSums(), OR cbind.na() then rowSums().
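
For a rough base-R sketch of that idea with just two vectors (the
rbind.na()/cbind.na() scripts in the link above handle the NA-padding more
generally):

u <- c(10, 20, 30)
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
length(u) <- length(v)              # pad the shorter vector with NA, no recycling
colSums(rbind(u, v), na.rm = TRUE)  # extra values of v are left untouched
## [1] 11 22 33  4  5  6  7  8  9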

Best of luck,

W Michels, Ph.D.



On Wed, Dec 13, 2017 at 8:34 AM, William Dunlap via R-help
 wrote:
> Without recycling you would get:
> u <- c(10, 20, 30)
> u + 1
> #[1] 11 20 30
> which would be pretty inconvenient.
>
> (Note that the recycling rule has to make a special case for when one
> argument has length zero - the output then has length zero as well.)
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Dec 12, 2017 at 9:41 PM, Maingo via R-help 
> wrote:
>
>> I'm a newbie for R lang. And I recently came across the "Recycling Rule"
>> when adding two vectors of unequal length.
>>
>> I learned from this tutor [ http://www.r-tutor.com/r-
>> introduction/vector/vector-arithmetics ] that:
>>
>> ""
>>
>> If two vectors are of unequal length, the shorter one will be recycled in
>> order to match the longer vector. For example, the following vectors u and
>> v have different lengths, and their sum is computed by recycling values of
>> the shorter vector u.
>>
>> > u = c(10, 20, 30)
>>
>> > v = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
>>
>> > u + v
>>
>> [1] 11 22 33 14 25 36 17 28 39
>>
>> ""
>>
>> And I wondered, why should the shorter vector u be recycled? Why not just
>> leave the extra values (4, 5, 6, 7, 8, 9) in the longer vector untouched by
>> default?
>>
>> Otherwise is it better to have another function that could add vectors
>> without recycling? Right now the recycling feature bugs me a lot.
>>
>> Sent with [ProtonMail](https://protonmail.com) Secure Email.
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] drc, ggplot2, and gridExtra

2018-05-22 Thread William Michels via R-help
Hi, I was able to get Eivind's code to work by slight modification of
the "grab" function:

grab <- function() {
grid.echo()
grid.grab()
}
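
For the original goal of putting plots and the source data on one page, a
sketch along these lines may help -- the drc fit itself isn't reproduced
here, and mtcars just stands in for the data table:

library(gridGraphics)
library(gridExtra)

grab <- function() {
grid.echo()
grid.grab()
}

plot(1:10, (1:10)^2)    # stand-in for plot() of a drc model
p <- grab()

grid.arrange(p, tableGrob(head(mtcars[, 1:3])), ncol = 1)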

Best Regards,

W. Michels, Ph.D.


On Fri, May 18, 2018 at 9:56 AM, Eivind K. Dovik  wrote:
> On Fri, 18 May 2018, Ed Siefker wrote:
>
>> I have dose response data I have analyzed with the 'drc' package.
>> Using plot() works great.  I want to arrange my plots and source
>> data on a single page.  I think 'gridExtra' is the usual package for
>> this.
>>
>> I could use plot() and par(mfrow=...), but then I can't put the source
>> data table on the page.
>>
>> gridExtra provides grid.table() which makes nice graphical tables. It
>> doesn't work with par(mfrow=...), but has the function grid.arrange()
>> instead.
>>
>> Unfortunately, grid.arrange() doesn't accept plot(). It does work with
>> qplot() from 'ggplot2'.  Unfortunately, qplot() doesn't know how to
>> deal with data of class drc.
>>
>> I'm at a loss on how to proceed here.  Any thoughts?
>>
>
> Hi,
>
> If you grab the plots as grobs, you can arrange them using grid.arrange()
>
> library(gridGraphics)
> library(gridExtra)
>
> grab <- function{
> grid.echo()
> grid.grab()
> }
>
> x <- rnorm(100, 1, 2)
> y <- rnorm(100, 0, 0.5)
>
> plot(x,y)
> p <- grab()
>
> a <- rnorm(20, 0, 1)
> b <- rnorm(20, 1, 2)
>
> plot(a, b)
> q <- grab()
>
> grid.arrange(p, q)
>
>
> Best,
> Eivind K. Dovik
> Bergen, NO
>
>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R examples in Agronomy

2018-06-13 Thread William Michels via R-help
Hello,

For introductory material there is--of course--Immer's Barley Data
(popularized by Bill Cleveland), and used extensively in R to
demonstrate lattice graphics:

>library(lattice)
>?barley

Note the example dotplot() at the bottom of the "barley" help page,
and also on the "barchart" help page. The citation is included, and
there is also other commentary online:

Immer, R. F., H. K. Hayes, and LeRoy Powers. (1934). Statistical
Determination of Barley Varietal Adaptation.  Journal of the American
Society of Agronomy, 26, 403–419.
Wright, Kevin (2013). Revisiting Immer's Barley Data. The American
Statistician, 67(3), 129–133.
http://blog.revolutionanalytics.com/2014/07/theres-no-mistake-in-the-barley-data.html
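
A hedged sketch of that kind of dotplot (close to, though not necessarily
identical with, the help-page example):

library(lattice)
dotplot(variety ~ yield | site, data = barley, groups = year,
        layout = c(1, 6), auto.key = list(space = "right"),
        xlab = "Barley Yield (bushels/acre)")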

For a more extensive collection of agronomic data, take a look at the
"agridat" package. There's a nice vignette as well.

https://CRAN.R-project.org/package=agridat
https://cran.r-project.org/web/packages/agridat/vignettes/agridat_examples.pdf

HTH,

Bill.

William Michels, Ph.D.



On Wed, Jun 13, 2018 at 5:56 AM, Khaled Ibrahimi
 wrote:
> Dear all,
> Are there good R stat examples in the field of agronomy (especially field
> experiments)?
> Thanks
> /-
> Khaled IBRAHIMI, PhD
> Assistant Professor, Soil Science & Environment
> Higher Institute of Agricultural Sciences of Chott-Mariem
> The University of Sousse, Tunisia
> Tel.: 216 97 276 835
> Email: ibrahimi.is...@gmail.com
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Suggestions for scatter plot of many data

2018-07-19 Thread William Michels via R-help
Hello, In addition to Duncan Mackay's excellent suggestion, I would
recommend Bert Gunter's "stripless" package, for high-density
Trellis-type conditioning plots. See the vignette for examples, and
try out the code for "earthquake" and "barley" plots from the
reference manual.

https://CRAN.R-project.org/package=stripless
https://cran.r-project.org/web/packages/stripless/vignettes/stripless_vignette.html

Sincerely,

W. Michels, Ph.D.

On Thu, Jul 19, 2018 at 6:37 AM, Duncan Mackay  wrote:
> Hi Rich
>
> Try something like this
>
> set.seed(1)
> xy <-
> data.frame(x = rnorm(108),
>y = rnorm(108),
>gp = rep(1:9, ea = 12))
>
>
> xyplot(y~x|gp, xy,
>as.table = TRUE,
>strip = F,
>strip.left = F,
>layout = c(3,3),
>par.settings= list(layout.heights = list(main = 0,
>   axis.top = 0.3),
>   plot.symbol = list(pch = ".",
>  col = "#00",
>  cex = 3)
>),
>scales = list(x = list(alternating = FALSE,
>   relation= "same"),
>  y = list(alternating = FALSE,
>   relation= "same")
>  ),
>panel = function(x,y, ...){
>
>  panel.xyplot(x,y, ...)
>  panel.text(-1, 2, paste("Group", 1:9)[which.packet()])
>
>}
> )
>
> I have put over 60 panels on an A4 page.
> You may have to put an if statement for the group names if they overlap
> data.
> Space is a premium - you can reduce the right margin similar to the top see
> ?trellis.par.get()
>
> Regards
>
> Duncan
>
> Duncan Mackay
> Department of Agronomy and Soil Science
> University of New England
> Armidale NSW 2350
>
>
>
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rich Shepard
> Sent: Thursday, 19 July 2018 06:55
> To: r-help@r-project.org
> Subject: [R] Suggestions for scatter plot of many data
>
>I have daily precipitation data for 58 locations from 2005-01-01 through
> 2018-06-18. Among other plots and analyses I want to apply lattice's
> xyplot() to illustrate the abundance and patterns of the data.
>
>I've used a vector of colors (and a key) when there were only eight
> weather stations and the date range was three months. This was very
> effective in communicating the amounts and patterns.
>
>I'm asking for ideas on how to best present these data in a scatter plot.
>
> Regards,
>
> Rich
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread William Michels via R-help
I'm sure there are more efficient ways, but this works:

> test1 <- matrix(runif(50), nrow=10, ncol=5)
> ## test1 <- as.data.frame(test1)
> test1 <- rbind(test1, NA)
> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
> test1


HTH,

Bill.

William Michels, Ph.D.



On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD  wrote:
>
> Hi R'ers:
> Given a data.frame of five columns and ten rows.
> I would like to take the sum of, say, the first and third columns only.
> For the remaining columns, I do not want any calculations, thus rendering their
> "values" on the "total" row blank. The sum/total row is to be combined with the
> original data.frame, yielding a data.frame with five columns and eleven rows.
>
> Thanks, in advance.
> Bruce
>
>
> __
> Bruce Ratner PhD
> The Significant Statistician™
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread William Michels via R-help
Again, you should always copy the R-help list on replies to your OP.

The short answer is you **shouldn't** replace NAs with blanks in your
matrix or dataframe.  NA is the proper designation for those cell
positions. Replacing NA with a "blank" in a dataframe will convert
that column to a "character" mode, precluding further numeric
manipulation of those columns.
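
A quick toy illustration of that coercion:

df <- data.frame(a = c(1.5, NA, 3.2))
df$a[is.na(df$a)] <- ""   # replace NA with a "blank"
class(df$a)               # now "character": the whole column has been coerced
## sum(df$a)              # would now fail: invalid 'type' (character) of argument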

Consider your workflow: are you trying to export a table? If so, take
a look at installing pander (see 'missing' argument on webpage below):

https://cran.r-project.org/web/packages/pander/README.html

Finally, please review the Introductory PDF, available here:

https://cran.r-project.org/doc/manuals/R-intro.pdf

HTH, Bill.

William Michels, Ph.D.



On Fri, Mar 31, 2017 at 11:21 AM, BR_email  wrote:
> William:
> How can I replace the "NAs" with blanks?
> Bruce
>
> Bruce Ratner, Ph.D.
> The Significant Statistician™
>
>
> William Michels wrote:
>>
>> I'm sure there are more efficient ways, but this works:
>>
>>> test1 <- matrix(runif(50), nrow=10, ncol=5)
>>> ## test1 <- as.data.frame(test1)
>>> test1 <- rbind(test1, NA)
>>> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
>>> test1
>>
>>
>> HTH,
>>
>> Bill.
>>
>> William Michels, Ph.D.
>>
>>
>>
>> On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD  wrote:
>>>
>>> Hi R'ers:
>>> Given a data.frame of five columns and ten rows.
>>> I would like to take the sum of, say, the first and third columns only.
>>> For the remaining columns, I do not want any calculations, thus rendering
>>> their "values" on the "total" row blank. The sum/total row is to be combined
>>> with the original data.frame, yielding a data.frame with five columns and
>>> eleven rows.
>>>
>>> Thanks, in advance.
>>> Bruce
>>>
>>>
>>> __
>>> Bruce Ratner PhD
>>> The Significant Statistician™
>>>
>>>
>>>
>>>
>>>  [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread William Michels via R-help
Thank you Jeff for pointing out bad spreadsheet practices in R,
seconded by Mathew and Bert.

I should have considered creating a second dataframe ("test1_summary")
to distinguish raw from processed data. Those who want to address
memory issues caused by unnecessary duplication, feel free to chime
in.
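
A minimal sketch of that separation, reusing the toy data from earlier in
the thread (the name "test1_summary" is just a suggestion):

test1 <- as.data.frame(matrix(runif(50), nrow = 10, ncol = 5))
test1_summary <- data.frame(t(colSums(test1[, c(1, 3)])))  # totals kept apart from raw data
test1_summary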

Finally, thank you Bert for your most informative post on adding
attributes to dataframes. I really learned a lot!

Best Regards,

Bill.

William Michels, Ph.D.



On Fri, Mar 31, 2017 at 4:59 PM, Bert Gunter  wrote:
> All:
>
> 1. I agree wholeheartedly with prior responses.
>
> 2. But let's suppose that for some reason, you *did* want to carry
> around some "calculated values" with the data frame. Then one way to
> do it is to add them as attributes to the data frame. This way they
> cannot "pollute" the data in the way Jeff warned against; e.g.
>
> attr(your_frame,"colsums") <- colSums(your_frame)
>
> This of course calculates them all, but you can of course just attach
> some (e.g. colSums(your_frame[,c(1,3)] )
>
> 3. This, of course, has the disadvantage of requiring recalculation of
> the attribute if the data changes, which is an invitation to problems.
> A better approach might be to attach the *function* that does the
> calculation as an attribute, which when invoked always uses the
> current data:
>
> attr(your_frame,"colsums") <- function(x)colSums(x)
>
> For example:
>
> df <- data.frame(x=1:5,y=21:25)
> attr(df,"colsums")<- function(x)colSums(x)
>
> ## then:
>> attr(df,"colsums")(df)
>   x   y
>  15 115
>
> ## add a row
>> df[6,] <- rep(100,2)
>> attr(df,"colsums")(df)
>   x   y
> 115 215
>
>
> This survives changing the name of df:
>
>> dat <- df
>> attr(dat,"colsums")(dat)
>   x   y
> 115 215
>
> As it stands, the call: attr(df,"colsums")(df)  is a bit clumsy; one
> could easily write a function that does this sort of thing more
> cleanly, as, for example, is done via the "selfStart" functionality
> for nonlinear models.
>
> But all this presupposes that the OP is familiar with R programming
> paradigms, especially the use of functions as first class objects, and
> the language in general. While I may have missed this, his posts do
> not seem to me to indicate such familiarity, so as others have
> suggested, perhaps the best answer is to first spend some time with an
> R tutorial or two and *not* try to mimic bad spreadsheet practices in
> R.
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Fri, Mar 31, 2017 at 2:49 PM, Jeff Newmiller
>  wrote:
>> You can also look at the knitr-RMarkdown work flow, or the knitr-latex work 
>> flow. In both of these it is reasonable to convert your data frame to a 
>> temporary character-only form purely for output purposes. However, one can 
>> usually use an existing function to push your results out without damaging 
>> your working data.
>>
>> It is important to separate your data from your output because mixing 
>> results (totals) with data makes using the data further extremely difficult. 
>> Mixing them is one of the major flaws of the spreadsheet model of 
>> computation, and it causes problems there as well as in R.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On March 31, 2017 1:05:09 PM PDT, William Michels via R-help 
>>  wrote:
>>>Again, you should always copy the R-help list on replies to your OP.
>>>
>>>The short answer is you **shouldn't** replace NAs with blanks in your
>>>matrix or dataframe.  NA is the proper designation for those cell
>>>positions. Replacing NA with a "blank" in a dataframe will convert
>>>that column to a "character" mode, precluding further numeric
>>>manipulation of those columns.
>>>
>>>Consider your workflow: are you trying to export a table? If so, take
>>>a look at installing pander (see 'missing' argument on webpage below):
>>>
>>>https://cran.r-project.org/web/packages/pander/README.html
>>>
>>>Finally, please review the Introductory PDF, available here:
>>>
>>>https://cran.r-project.org/doc/manuals/R-intro.pdf
>>>
>>>HTH, Bill.
>>>
>>>William Michels, Ph.D.
>>>
>>>
>>>
>>>On Fri, Mar 31, 2017 at 11:21 AM, BR_email  wr

Re: [R] problems in vectors of dates_times

2017-04-07 Thread William Michels via R-help
I believe the lubridate package does a good job with time zones.

> install.packages("lubridate")
> library(lubridate)


Look at the supplied functions  with_tz()  and  force_tz().
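
A minimal sketch of the difference between the two (time and zone names
chosen arbitrarily):

library(lubridate)
x <- ymd_hms("2017-03-31 19:00:00", tz = "UTC")
with_tz(x, "America/New_York")    # same instant in time, printed in another zone
force_tz(x, "America/New_York")   # same clock time, re-stamped in another zone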

HTH,

Bill.

William J. Michels, Ph.D.



On Fri, Apr 7, 2017 at 12:52 AM, Jeff Newmiller
 wrote:
> R does a poor job of supporting timezone-specific objects... you have to 
> transfer the necessary attributes explicitly for many operations.  (It does 
> no job of supporting element-specific timezones so don't go there.)
>
> The good news is that R is pretty good at working with points in time, since 
> the default behavior of implementing time with numeric values in GMT always 
> means you can specify whatever timezone you want input or output to use, and 
> the timestamps are always ordered correctly in time.
>
> I find that using the default empty string for tz attributes on POSIXt 
> objects (meaning use whatever is default) and letting the TZ environment 
> variable control the "current default" timezone is the most effective way to 
> avoid frustration with this. Don't hesitate to change that variable when you 
> need to convert to or from character or POSIXlt..
>
> Sys.setenv( TZ="Etc/GMT+5" ) # read ?Olson
> x <- as.POSIXct( "2017-03-31 19:00:00" )
> Sys.setenv( TZ="Etc/GMT+8" )
> y <- as.POSIXct( "2017-03-31 16:00:00" )
> Sys.setenv( TZ="GMT" )
> print( x )
> print( y )
> --
> Sent from my phone. Please excuse my brevity.
>
> On April 7, 2017 12:00:52 AM PDT, Troels Ring  wrote:
>>Thanks a  lot - perhaps it is just understanding how times dates are
>>handled, sorry to bother if that is just the case
>>
>>C[1]==A[1]  # TRUE
>>
>>but
>>
>>C[1]
>>[1] "2013-03-28 07:00:00 CET"
>>A[1]
>>[1] "2013-03-28 06:00:00 UTC"
>>
>>
>>
>>
>>
>>Den 07-04-2017 kl. 08:27 skrev Ulrik Stervbo:
>>> Hi Troels,
>>>
>>> I get no error. I think we need more information to be of any help.
>>>
>>> Best wishes,
>>> Ulrik
>>>
>>> On Fri, 7 Apr 2017 at 08:17 Troels Ring  wrote:
>>>
 Dear friends - I have further problems  handling dates_times, as
 demonstrated below where concatenating two formatted vectors of
 date_times results in errors.
 I wonder why this happens and what was wrong in trying to take these
>>two
 vectors together
 All best wishes
 Troels Ring
 Aalborg, Denmark
 Windows
 R version 3.3.2 (2016-10-31)


 A <- structure(c(1364450400, 1364450400, 1364536800, 1364623200,
 1364709600,
 1364796000, 1364882400, 1364968800, 1365055200, 1365141600,
>>1365228000,
 1365314400, 1365400800), class = c("POSIXct", "POSIXt"), tzone =
>>"UTC")
 A
 B <- structure(c(1365141600, 1365228000, 1365314400, 1365400800,
 1365487200,
 1365573600, 136566, 1365746400, 1365832800, 1365919200,
>>1366005600,
 1366092000), class = c("POSIXct", "POSIXt"), tzone = "UTC")
 B
 C <- c(A,B)
 C

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

>>>  [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>__
>>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there a way to get R script line number

2017-04-07 Thread William Michels via R-help
Hi Brad,

Some of the debugging functions may be of use. You can look at trace()
or setBreakpoint(). But I believe Bert is correct in saying your
concept of a "Line Number" and R's concept of a "Line Number" will
differ.

Finally, you can look at the function findLineNum(), which can be
called external to your source code file (not embedded as in your
example).

>?debug
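
A sketch of the external approach, assuming a hypothetical file
"myscript.R" sourced with source references kept (see ?findLineNum for the
exact "file#line" syntax):

source("myscript.R", keep.source = TRUE)
findLineNum("myscript.R#3")    # which function bodies in that file span line 3?
setBreakpoint("myscript.R#3")  # same lookup, plus a breakpoint (browser) at that spot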

HTH,

Bill

William J. Michels, Ph.D.



On Thu, Apr 6, 2017 at 12:30 PM, Bert Gunter  wrote:
> I believe the answer is: No. "Line number" is an ambiguous concept.
> Does it mean physical line on a display of a given width? a line of
> code demarcated by e.g.  ; a step in the execution of script (that
> might display over several physical lines?)
>
> However, various IDE's have and display "line numbers," so you might
> try researching whichever one that you use.
>
> (Note: Correction/clarification requested if I am wrong on this).
>
> -- Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Thu, Apr 6, 2017 at 11:59 AM, Brad P  wrote:
>> Hello,
>>
>> Is there a way to get the current line number in an R script?
>>
>> As a silly example, if I have the following script and a function called
>> getLineNumber (suppose one exists!), then the result would be 3.
>>
>> 1 # This is start of script
>> 2
>> 3 print( getLineNumber() )
>> 4
>> 5 # End of script
>>
>> Thanks for any ideas!
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Too strange that I cannot install several packages

2017-04-10 Thread William Michels via R-help
For a base-R installation, you can print out multiple help pages
(function indices) like so:

> for(i in 1:length(sessionInfo()$basePkgs)) {
print(library(help = sessionInfo()$basePkgs[i], character.only = TRUE)) }

HTH,

Bill.

William Michels, Ph.D.



On Mon, Apr 10, 2017 at 11:15 AM, Doran, Harold  wrote:
> I did answer this question quite a few weeks ago. You then continued to email 
> me directly off list asking the exact question you posted below and at that 
> time I gave you the answer on how to solve.
>
> Your unwillingness to do even the basic study on R clogs this list 
> unnecessarily.
>
> -Original Message-
> From: Bruce Ratner PhD [mailto:b...@dmstat1.com]
> Sent: Monday, April 10, 2017 2:13 PM
> To: Doran, Harold 
> Cc: r-help@r-project.org
> Subject: Re: [R] Too strange that I cannot install several packages
>
> Dear Harold:
> If you do not want to answer my questions, then do not reply.
> Warmest Regards,
> Bruce
>
> __
> Bruce Ratner PhD
> The Significant Statistician™
> (516) 791-3544
> Statistical Predictive Analytics -- www.DMSTAT1.com Machine-Learning Data 
> Mining -- www.GenIQ.net
>
>
>
>> On Apr 10, 2017, at 1:11 PM, Doran, Harold  wrote:
>>
>> You really need to stop spamming this list and take time to learn R basics. 
>> You sent me emails directly on this and asked me this specific question 
>> before.
>>
>> These are not packages, but are functions and you do not work with R this 
>> way.
>>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
>> BR_email
>> Sent: Monday, April 10, 2017 1:08 PM
>> To: r-help@r-project.org
>> Subject: [R] Too strange that I cannot install several packages
>>
>> Hi Rers:
>> Is there anything I can check for as to why I cannot install several 
>> packages, i.e., sample, resample, resample_bootstrap, apply, sapply, ...?
>>
>> The error message I get is:
>>
>>> install.packages("sample") Installing package into
>> ‘C:/Users/BruceRatner/Documents/R/win-library/3.3’ (as ‘lib’ is
>> unspecified) Warning in install.packages :
>>   package ‘sample’ is not available (for R version 3.3.3)
>>
>>
>> Thanks.
>> Bruce
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Creating interactive graphs and exporting to Intranet site

2017-04-23 Thread William Michels via R-help
Hi Chris (and Sarah),

Chris you've listed a lot of restrictions, but I just wanted to
mention Jeroen Ooms' work developing OpenCPU:

"The OpenCPU system exposes an http API for embedded scientific
computing with R. The server can run either as a single-user
development server within the interactive R session, or as a
multi-user linux stack based on rApache and NGINX."

https://www.opencpu.org
https://www.opencpu.org/apps.html
https://github.com/jeroen/opencpu

You may be able to figure out a way to work around (some of) your
restrictions using opencpu.js, the OpenCPU JavaScript client library.
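
As a rough illustration of what the HTTP API looks like from R (endpoint
pattern as documented on the OpenCPU site; the public demo server below is
only a stand-in for a local install):

library(httr)
## POST /ocpu/library/{package}/R/{function} runs a function on the server
r <- POST("https://cloud.opencpu.org/ocpu/library/stats/R/rnorm",
          body = list(n = 10), encode = "form")
status_code(r)   # 201 on success; response headers point to the stored results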

HTH,

Bill

William Michels, Ph.D.


On Sun, Apr 23, 2017 at 2:26 PM, Sarah Goslee  wrote:
> HI Chris,
>
> You can use the R plotly library on your own computer to create the
> interactive graph, then upload the code to the server.
>
> I have a setup like that where real-time data is processed every hour,
> a Rmarkdown file is rendered using the R plotly package to make the
> graphs, and then the resulting html file is copied to the server.
>
> It sounds like exactly what you need.
>
> Sarah
>
> On Sat, Apr 22, 2017 at 10:29 PM, Chris Battiston
>  wrote:
>> Good evening,
>>
>> I’m relatively new to using R and am trying to find a way to create a series 
>> of interconnected graphs where I have a filter (either a drop down or series 
>> of checkboxes) where when an option is selected, all graphs are updated to 
>> show that group’s data.  I need to keep these graphs internal to our 
>> organization, so can’t use Shiny etc.; I am also unable to run R or other 
>> products on my server (company policy).  So, basically what I’m trying to do 
>> is create the dashboard on my desktop, export the HTML or whatever files, 
>> and post those to the Intranet.  I have tried ggvis, iplots, and a variety 
>> of other packages but I cannot seem to get them to work as i need them to.
>>
>> Any suggestions?  I need to present my proposed plan to the directors on 
>> Wednesday and really don’t want to use Excel for the graphs - I want this to 
>> be intuitive for them, but ensuring that the report is easily maintained and 
>> more flexible than a Pivot Table.
>>
>> Thanks so much for your time and have a good evening
>> Chris
>
>
> Sarah Goslee
> http://www.functionaldiversity.org
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] install lapack for mac

2017-05-02 Thread William Michels via R-help
Have you tried R-GUI, in the R-distribution available below?

https://cran.r-project.org/bin/macosx/

Here's a similar question on SO:

http://stackoverflow.com/questions/13476736/r-lapack-routines-cannot-be-loaded
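
It may also be worth checking what R itself is already linked against
before installing anything system-wide (a quick check; output details vary
by R version):

La_version()    # LAPACK version built into / linked against R
sessionInfo()   # newer R versions also list the BLAS/LAPACK libraries in use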

HTH, Bill.

William Michels, Ph.D.


On Tue, May 2, 2017 at 11:51 AM, Assa Yeroslaviz  wrote:
> Hi,
>
> I am running R under Rstudio for the analysis of single-cell RNA-Seq data.
>
> When trying to analyse some data I keep getting the message
>
>> slicer_traj_lle <- lle(t(deng[slicer_genes,]), m = 2, k)$Y
> finding neighbours
> calculating weights
> Error in eigen(G, symmetric = TRUE, only.values = TRUE) :
>   LAPACK routines cannot be loaded
>
>
> After searching the net I couldn't find a way to install LAPACK on a mac.
> When trying to install it via homebrew I get this message (this is a
> shorted message. the complete message is below):
>
> brew install lapack
>
> ...
>
> macOS already provides this software and installing another version in
>
> parallel can cause all kinds of trouble.
>
> For compilers to find this software you may need to set:
>
> LDFLAGS:  -L/usr/local/opt/lapack/lib
>
> CPPFLAGS: -I/usr/local/opt/lapack/include
>
> For pkg-config to find this software you may need to set:
>
> PKG_CONFIG_PATH: /usr/local/opt/lapack/lib/pkgconfig
>
>
> I truly don't understand why my Mac can't find LAPACK if it is already
> there, and I still can't figure out how to tell my R tool / RStudio where
> to find LAPACK.
>
> Does anyone has an idea how to do this?
>
> thanks
>
> Assa
>> R.version
>_
> platform   x86_64-apple-darwin13.4.0
> arch   x86_64
> os darwin13.4.0
> system x86_64, darwin13.4.0
> status
> major  3
> minor  3.1
> year   2016
> month  06
> day21
> svn rev70800
> language   R
> version.string R version 3.3.1 (2016-06-21)
> nickname   Bug in Your Hair
>
>
>> sessionInfo()
> R version 3.3.1 (2016-06-21)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.12.4 (Sierra)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
>  [1] splines   stats4parallel  stats graphics  grDevices utils
> datasets  methods
> [10] base
>
> other attached packages:
>  [1] R.utils_2.5.0R.oo_1.21.0  R.methodsS3_1.7.1
>  Seurat_1.4.0.14
>  [5] cowplot_0.7.0SLICER_0.2.0 alphahull_2.1lle_1.1
>
>  [9] snowfall_1.84-6.1snow_0.4-2   MASS_7.3-47
>  scatterplot3d_0.3-40
> [13] igraph_1.0.1 destiny_2.0.8monocle_2.2.0
>  DDRTree_0.1.4
> [17] irlba_2.1.2  VGAM_1.0-3   ggplot2_2.2.1
>  Biobase_2.34.0
> [21] BiocGenerics_0.20.0  Matrix_1.2-8 M3Drop_1.0.0
> numDeriv_2016.8-1
> [25] TSCAN_1.12.0
>
> loaded via a namespace (and not attached):
>   [1] shinydashboard_0.5.3   lme4_1.1-13RSQLite_1.1-2
>   [4] AnnotationDbi_1.36.2   htmlwidgets_0.8grid_3.3.1
>   [7] combinat_0.0-8 trimcluster_0.1-2  ranger_0.7.0
>  [10] Rtsne_0.13 munsell_0.4.3  codetools_0.2-15
>  [13] statmod_1.4.29 colorspace_1.3-2   fastICA_1.2-0
>  [16] knitr_1.15.1   ROCR_1.0-7 robustbase_0.92-7
>  [19] vcd_1.4-3  tensor_1.5 VIM_4.7.0
>  [22] TTR_0.23-1 lars_1.2   slam_0.1-40
>  [25] splancs_2.01-40bbmle_1.0.19   mnormt_1.5-5
>  [28] polyclip_1.6-1 pheatmap_1.0.8 rprojroot_1.2
>  [31] diptest_0.75-7 R6_2.2.0   RcppEigen_0.3.2.9.1
>  [34] flexmix_2.3-14 bitops_1.0-6   spatstat.utils_1.4-1
>  [37] assertthat_0.2.0   scales_0.4.1   nnet_7.3-12
>  [40] gtable_0.2.0   goftest_1.1-1  MatrixModels_0.4-1
>  [43] lazyeval_0.2.0 ModelMetrics_1.1.0 acepack_1.4.1
>  [46] checkmate_1.8.2reshape2_1.4.2 abind_1.4-5
>  [49] backports_1.0.5httpuv_1.3.3   rsconnect_0.7
>  [52] Hmisc_4.0-2caret_6.0-76   tools_3.3.1
>  [55] gplots_3.0.1   RColorBrewer_1.1-2 proxy_0.4-16
>  [58] Rcpp_0.12.10   plyr_1.8.4 base64enc_0.1-3
>  [61] RCurl_1.95-4.8 rpart_4.1-11   deldir_0.1-14
>  [64] pbapply_1.3-2  viridis_0.4.0  S4Vectors_0.12.2
>  [67] zoo_1.8-0  cluster_2.0.6  magrittr_1.5
>  [70] data.table_1.10.4  SparseM_1.77   lmtest_0.9-35
>  [73] mvtnorm_1.0-6  matrixStats_0.52.2 mime_0.5
>  [76] evaluate_0.10  xtable_1.8-2   smoother_1.1
>  [79] pbkrtest_0.4-7 XML_3.98-1.6   mclust_5.2.3
>  [82] IRanges_2.8.2  gridExtra_2.2.1HSMMSingleCell_0.108.0
>  [85] biomaRt_2.30.0 tibble_1.3.0   KernSmooth_2.23-15
>  [88] minqa_1.2.4htmltools_0.3.6segmented_0.5-1.4
>  [91] mgcv_1.8-17Formula_1.2-1  tclust

Re: [R] About calculating average values from several matrices

2017-05-09 Thread William Michels via R-help
Dear Lily,

Harold is telling you to type "?round" at the R command prompt to pull
up the "round" help page.

>?round
>help("round")

AFAIK, the above two commands are equivalent, in general.
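
Putting the two pieces of this thread together (toy matrices standing in
for the eight files):

mylist <- list(matrix(1:4, 2, 2), matrix(5:8, 2, 2), matrix(9:12, 2, 2))
avg <- Reduce(`+`, mylist) / length(mylist)
round(avg, 1)    # one decimal digit; use round(avg, 2) for two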

Best, Bill.

W. Michels, Ph.D.



On Tue, May 9, 2017 at 8:11 AM, Doran, Harold  wrote:
> ?round
>
>
> From: lily li [mailto:chocol...@gmail.com]
> Sent: Tuesday, May 09, 2017 11:10 AM
> To: Charles Determan 
> Cc: Doran, Harold ; R mailing list 
> Subject: Re: [R] About calculating average values from several matrices
>
> Thanks very much, it works. But how to round the values to have only 1 
> decimal digit or 2 decimal digits? I think by dividing, the values are double 
> type now. Thanks again.
>
>
> On Tue, May 9, 2017 at 9:04 AM, Charles Determan 
> mailto:cdeterma...@gmail.com>> wrote:
> If you want the mean of each element across you list of matrices the 
> following should provide what you are looking for where Reduce sums all your 
> matrix elements across matrices and the simply divided my the number of 
> matrices for the element-wise mean.
>
> Reduce(`+`, mylist)/length(mylist)
> Regards,
> Charles
>
> On Tue, May 9, 2017 at 9:52 AM, lily li 
> mailto:chocol...@gmail.com>> wrote:
> I meant for each cell, it takes the average from other dataframes at the
> same cell. I don't know how to deal with row names and col names though, so
> it has the error message.
>
> On Tue, May 9, 2017 at 8:50 AM, Doran, Harold 
> mailto:hdo...@air.org>> wrote:
>
>> It’s not clear to me what your actual structure is. Can you provide
>> str(object)? Assuming it is a list, and you want the mean over all cells or
>> columns, you might want like this:
>>
>>
>>
>> myData <- vector("list", 3)
>>
>>
>>
>> for(i in 1:3){
>>
>> myData[[i]] <- matrix(rnorm(100), 10, 10)
>>
>> }
>>
>>
>>
>> ### mean over all cells
>>
>> sapply(myData, function(x) mean(x))
>>
>>
>>
>> ### mean over all columns
>>
>> sapply(myData, function(x) colMeans(x))
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* lily li [mailto:chocol...@gmail.com]
>> *Sent:* Tuesday, May 09, 2017 10:44 AM
>> *To:* Doran, Harold mailto:hdo...@air.org>>
>> *Cc:* R mailing list mailto:r-help@r-project.org>>
>> *Subject:* Re: [R] About calculating average values from several matrices
>
>>
>>
>>
>> I'm trying to get a new dataframe or whatever to call, which has the same
>> structure with each file as listed above. For each cell in the new
>> dataframe or the new file, it is the average value from former dataframes
>> at the same location. Thanks.
>>
>>
>>
>> On Tue, May 9, 2017 at 8:41 AM, Doran, Harold 
>> mailto:hdo...@air.org>> wrote:
>>
>> Are you trying to take the mean over all cells, or over rows/columns
>> within each dataframe. Also, are these different dataframes stored within a
>> list or are they standalone?
>>
>>
>>
>>
>> -Original Message-
>> From: R-help 
>> [mailto:r-help-boun...@r-project.org] 
>> On Behalf Of lily li
>> Sent: Tuesday, May 09, 2017 10:39 AM
>> To: R mailing list mailto:r-help@r-project.org>>
>> Subject: [R] About calculating average values from several matrices
>>
>> Hi R users,
>>
>> I have a question about manipulating the data.
>> For example, there are several such data frames or matrices, and I want to
>> calculate the average value from all the data frames or matrices. How to do
>> it? Also, should I convert them to data frame or matrix first? Right now,
>> when I use typeof() function, each one is a list.
>>
>> file1
>>        jan   feb   mar   apr   may   jun   jul   aug   sep   oct   nov
>>
>> app1   1.1   1.2   0.8   0.9   1.3   1.5   2.2   3.2   3.0   1.2   1.1
>> app2   3.1   3.2   2.8   2.5   2.3   2.5   3.2   3.0   2.9   1.8   1.8
>> app3   5.1   5.2   3.8   4.9   5.3   5.5   5.2   4.2   5.0   4.2   4.1
>>
>> file2
>>        jan   feb   mar   apr   may   jun   jul   aug   sep   oct   nov
>>
>> app1   1.9   1.5   0.5   0.9   1.2   1.8   2.5   3.7   3.2   1.5   1.6
>> app2   3.5   3.7   2.3   2.2   2.5   2.0   3.6   3.2   2.8   1.2   1.4
>> app3   5.5   5.0   3.5   4.4   5.4   5.6   5.3   4.4   5.2   4.3   4.2
>>
>> file3 has the similar structure and values...
>>
>> There are eight such files, and when I use the function mean(file1, file2,
>> file3, ..., file8), it returns the error below. Thanks for your help.
>>
>> Warning message:
>> In mean.default(file1, file2, file3, file4, file5, file6, file7,  :
>>   argument is not numeric or logical: returning NA
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To 
>> UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>
> [[alternative HTML version deleted]]

Re: [R] [FORGED] Logical Operators' inconsistent Behavior

2017-05-21 Thread William Michels via R-help
Looking below and online, R's truth tables for NOT, AND, OR are
identical to the NOT, AND, OR truth tables originating from Stephen
Cole Kleene's "strong logic of indeterminacy", as demonstrated on the
Wikipedia page entitled, "Three-Valued Logic"--specifically in the
section entitled "Kleene and Priest Logics":

https://en.wikipedia.org/wiki/Three-valued_logic#Kleene_and_Priest_logics

>
> ttNOT <- cbind(c(FALSE, NA, TRUE), !c(FALSE, NA, TRUE))
> rownames(ttNOT) <- c("False", "na", "True")
> colnames(ttNOT) <- c("A", "Not(A)")
> ttNOT
          A Not(A)
False FALSE   TRUE
na       NA     NA
True   TRUE  FALSE
>
> ttAND <- outer(c(FALSE, NA, TRUE), c(FALSE, NA, TRUE), "&" )
> rownames(ttAND) <- c("False", "na", "True")
> colnames(ttAND) <- c("False", "na", "True")
> ttAND
      False    na  True
False FALSE FALSE FALSE
na    FALSE    NA    NA
True  FALSE    NA  TRUE
>
> ttOR <- outer(c(FALSE, NA, TRUE), c(FALSE, NA, TRUE), "|" )
> rownames(ttOR) <- c("False", "na", "True")
> colnames(ttOR) <- c("False", "na", "True")
> ttOR
      False   na True
False FALSE   NA TRUE
na       NA   NA TRUE
True   TRUE TRUE TRUE
>
>

The bottom section of the same Wikipedia page (section entitled
"Application in SQL" ), and an additional Wikipedia page entitled
"Null (SQL)" discusses how the Kleene logic described above is
differentially implemented in SQL.

https://en.wikipedia.org/wiki/Null_(SQL)

HTH,

Bill

William Michels, Ph.D.




On Sun, May 21, 2017 at 7:00 AM, Hadley Wickham  wrote:
> On Fri, May 19, 2017 at 6:38 AM, S Ellison  wrote:
>>> TRUE & FALSE is FALSE but TRUE & TRUE is TRUE, so TRUE & NA could be
>>> either TRUE or FALSE and consequently is NA.
>>>
>>> OTOH FALSE & (anything) is FALSE so FALSE & NA is FALSE.
>>>
>>> As I said *think* about it; don't just go with your immediate knee-jerk
>>> (simplistic) reaction.
>>
>> Hmm... not sure that was quite fair to the OP. Yes,  FALSE &  == 
>> FALSE. But 'NA' does not mean 'anything'; it means 'missing' (see ?'NA'). It 
>> is much less obvious that FALSE &  should generate a non-missing 
>> value. SQL, for example, generally  takes the view that any expression 
>> involving 'missing' is 'missing'.
>
> That's not TRUE ;)
>
> sqlite> select (3 > 2) OR NULL;
> 1
>
> sqlite> select (4 < 3) AND NULL;
> 0
>
> Hadley
>
>
> --
> http://hadley.nz
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [FORGED] Logical Operators' inconsistent Behavior

2017-05-22 Thread William Michels via R-help
Evaluation of the NOT, AND, OR logical statements below in MySQL
5.5.30-log Community Server (GPL) replicate R's truth tables for NOT,
AND, OR. See MySQL queries (below), which are in agreement with R
truth table code posted in this thread:


bash-3.2$ mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 346
Server version: 5.5.30-log MySQL Community Server (GPL)
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SELECT FALSE, NULL, TRUE;
+---+--+--+
| FALSE | NULL | TRUE |
+---+--+--+
| 0 | NULL |1 |
+---+--+--+
1 row in set (0.00 sec)

mysql> SELECT NOT FALSE, NOT NULL, NOT TRUE;
+---+--+--+
| NOT FALSE | NOT NULL | NOT TRUE |
+---+--+--+
| 1 | NULL |0 |
+---+--+--+
1 row in set (0.00 sec)

mysql> SELECT FALSE AND FALSE,
-> FALSE AND NULL,
-> FALSE AND TRUE;
+-+++
| FALSE AND FALSE | FALSE AND NULL | FALSE AND TRUE |
+-+++
|   0 |  0 |  0 |
+-+++
1 row in set (0.00 sec)

mysql> SELECT NULL AND NULL,
-> NULL AND TRUE,
-> TRUE AND TRUE;
+---+---+---+
| NULL AND NULL | NULL AND TRUE | TRUE AND TRUE |
+---+---+---+
|  NULL |  NULL | 1 |
+---+---+---+
1 row in set (0.00 sec)

mysql> SELECT TRUE OR TRUE,
-> NULL OR TRUE,
-> FALSE OR TRUE;
+--+--+---+
| TRUE OR TRUE | NULL OR TRUE | FALSE OR TRUE |
+--+--+---+
|1 |1 | 1 |
+--+--+---+
1 row in set (0.00 sec)

mysql> SELECT NULL OR NULL,
-> FALSE OR NULL,
-> FALSE OR FALSE;
+--+---++
| NULL OR NULL | FALSE OR NULL | FALSE OR FALSE |
+--+---++
| NULL |  NULL |  0 |
+--+---++
1 row in set (0.00 sec)

mysql>


HTH,

Bill

William Michels, Ph.D.


On Sun, May 21, 2017 at 7:00 AM, Hadley Wickham  wrote:
> On Fri, May 19, 2017 at 6:38 AM, S Ellison  wrote:
>>> TRUE & FALSE is FALSE but TRUE & TRUE is TRUE, so TRUE & NA could be
>>> either TRUE or FALSE and consequently is NA.
>>>
>>> OTOH FALSE & (anything) is FALSE so FALSE & NA is FALSE.
>>>
>>> As I said *think* about it; don't just go with your immediate knee-jerk
>>> (simplistic) reaction.
>>
>> Hmm... not sure that was quite fair to the OP. Yes,  FALSE &  == 
>> FALSE. But 'NA' does not mean 'anything'; it means 'missing' (see ?'NA'). It 
>> is much less obvious that FALSE &  should generate a non-missing 
>> value. SQL, for example, generally  takes the view that any expression 
>> involving 'missing' is 'missing'.
>
> That's not TRUE ;)
>
> sqlite> select (3 > 2) OR NULL;
> 1
>
> sqlite> select (4 < 3) AND NULL;
> 0
>
> Hadley
>
>
> --
> http://hadley.nz
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About change columns and specific rows in R

2017-05-22 Thread William Michels via R-help
Hi Lily,

You're on the right track, but you should define a new column first
(filled with NA values), then specify the precise rows and columns on
both the left and right hand sides of the assignment operator that
will be altered. Luckily, this is pretty easy...just remember to use
which() on the right hand side of the assignment operator to get the
index of the rows you want. Example below for "product1":

> DF$product1_1 <- NA
> DF[DF$month == 1, "product1_1"] <- DF[which(DF$month == 1), "product1"]*3.1
>


HTH,

Bill.

William Michels, Ph.D.




On Mon, May 22, 2017 at 9:56 PM, lily li  wrote:
> Hi R users,
>
> I have a question about manipulating the dataframe. I want to create a new
> dataframe, and to multiply rows with different seasons for different
> constants.
>
> DF
> year   month   day   product1   product2   product3
> 1981       1     1         18         56         20
> 1981       1     2         19         45         22
> 1981       1     3         16         48         28
> 1981       1     4         19         50         21
> 1981       2     1         17         49         25
> 1981       2     2         20         47         23
> 1981       2     3         21         52         27
>
> For example, how to multiply product1 in month1 by 3.1, and to multiply
> product3 in month2 by 2.0? I wrote the code like this but does not work.
> Thanks for your help.
>
> DF['month'==1, ]$product1_1 = DF['month'==1, ]$product1 * 3.1;
> DF['month'==2, ]$product3_1 = DF['month'==1, ]$product3 * 2.0;
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About change columns and specific rows in R

2017-05-23 Thread William Michels via R-help
Hi Ivan,

I was just writing a follow-up note as your note came in. While the
code I posted previously works fine, using which() is unnecessary.

> DF <- read.csv("~/lily.csv")
> DF$product1_1 <- NA
> DF$product1_1 <- DF[DF$month == 1, "product1"]*3.1
Error in `$<-.data.frame`(`*tmp*`, "product1_1", value = c(55.8, 58.9,  :
  replacement has 4 rows, data has 7
> DF$product1_1 <- DF[which(DF$month == 1), "product1"]*3.1
Error in `$<-.data.frame`(`*tmp*`, "product1_1", value = c(55.8, 58.9,  :
  replacement has 4 rows, data has 7
> DF[DF$month == 1, "product1_1"] <- DF[DF$month == 1, "product1"]*3.1
> DF
  year month day product1 product2 product3 product1_1
1 1981 1   1   18   56   20   55.8
2 1981 1   2   19   45   22   58.9
3 1981 1   3   16   48   28   49.6
4 1981 1   4   19   50   21   58.9
5 1981 2   1   17   49   25 NA
6 1981 2   2   20   47   23 NA
7 1981 2   3   21   52   27 NA
>

The two errors above were caused because I failed to specify rows on
the left hand side of the assignment operator, not because I failed to
use which(). Once rows and columns are specified on both sides, the
assignment works fine.

(I do however, prefer to create an "NA" column first. Personal preference ;-).

Best Regards,

Bill.



On Mon, May 22, 2017 at 11:57 PM, Ivan Calandra  wrote:
> Hi,
>
> Actually, you don't need to create the column first, nor to use which:
> DF[DF$month==1, "product1_1"] = DF[DF$month==1, "product1"] * 3.1
>
> The "[" is a great tool that you need to learn. In this case, you don't need
> to combine "[" and $: within the square brackets, the vector before the
> comma indexes the rows and the one after the comma indexes the columns.
>
> The other thing you were missing correctly referencing the rows. You have to
> specify the data.frame you want to look into.
>
> And last, learn to use dput() to provide a nice reproducible example.
>
> HTH,
> Ivan
>
>
> --
> Dr. Ivan Calandra
> TraCEr, Laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>
> On 23/05/2017 08:45, William Michels via R-help wrote:
>>
>> Hi Lily,
>>
>> You're on the right track, but you should define a new column first
>> (filled with NA values), then specify the precise rows and columns on
>> both the left and right hand sides of the assignment operator that
>> will be altered. Luckily, this is pretty easy...just remember to use
>> which() on the right hand side of the assignment operator to get the
>> index of the rows you want. Example below for "product1":
>>
>>> DF$product1_1 <- NA
>>> DF[DF$month == 1, "product1_1"] <- DF[which(DF$month == 1),
>>> "product1"]*3.1
>>>
>>
>> HTH,
>>
>> Bill.
>>
>> William Michels, Ph.D.
>>
>>
>>
>>
>> On Mon, May 22, 2017 at 9:56 PM, lily li  wrote:
>>>
>>> Hi R users,
>>>
>>> I have a question about manipulating the dataframe. I want to create a
>>> new
>>> dataframe, and to multiply rows with different seasons for different
>>> constants.
>>>
>>> DF
>>> year   month   day   product1   product2   product3
>>> 1981       1     1         18         56         20
>>> 1981       1     2         19         45         22
>>> 1981       1     3         16         48         28
>>> 1981       1     4         19         50         21
>>> 1981       2     1         17         49         25
>>> 1981       2     2         20         47         23
>>> 1981       2     3         21         52         27
>>>
>>> For example, how to multiply product1 in month1 by 3.1, and to multiply
>>> product3 in month2 by 2.0? I wrote the code like this but does not work.
>>> Thanks for your help.
>>>
>>> DF['month'==1, ]$product1_1 = DF['month'==1, ]$product1 * 3.1;
>>> DF['month'==2, ]$product3_1 = DF['month'==1, ]$product3 * 2.0;
>>>
>>>  [[alternative HTML version deleted]]
>>>
>>> __
>>>

Re: [R] write.dna command

2017-06-17 Thread William Michels via R-help
We'll need more information on the packages you're using. Can you post the
output of:

> sessionInfo()

Finally, is this a Bioconductor question? They have their own support site:

https://support.bioconductor.org

HTH,

William Michels, Ph.D.


On Sat, Jun 17, 2017 at 11:05 AM, Jeff Newmiller 
wrote:

> I suspect you meant
>
> WD <- "~/Documents/Scripting/R_Studio/Sequences/"
>
> but I am entirely unfamiliar with the packages you are using, and know
> nothing about what is on your hard drive.
>
> For future reference:
>
> A) Read the Posting Guide.  This is a plain text email list, and your html
> formatting gets removed leaving a mess that is not always readable.
>
> B) Most frequent users of R change their working directory to where their
> project files are before they start R. If you are using RStudio, the use of
> Projects will take care of this for you. Then you don't have to put in your
> whole working path in the script and you can copy/move your R and data
> files elsewhere without breaking everything.
> --
> Sent from my phone. Please excuse my brevity.
>
> On June 17, 2017 7:26:42 AM PDT, Mogjib Salek  wrote:
> >Hi all,
> >
> >I am learning R by "doing". And this is my first post.
> >
> >I want to use R: 1- to fetch a DNA sequence from a databank (see
> >below)
> >and 2- store it as a FASTA file.
> >
> >The problem: neither an error is prompted nor the fasta file is
> >created.
> >Testing the code (see below), I notice that everything works until
> >the *"write.dna"
> >*command - which is not creating the fasta file.
> >
> >Here is my code:
> >
> >Get gene sequence from GenBank and store it as fasta file
> >16 June 2017
> >
> >#1- Set the working directory and make sure the right libraries are
> >installed
> >(make sure 'ape' and 'seqinr' packages are installed)
> >
> >WD <- "~Documents/Scripting/R_Studio/Sequences/"
> >setwd <- (WD)
> >
> >#2- Fetch a sequence ( bellow, "enter  manually the desired DNA ID")
> >from
> >GenBank and store it as fasta file.
> >
> >  DNAid <- "JF806202"
> >
> ># Store the sequence in lst (a list)
> >  lst <- read.GenBank(DNAid, as.character = T)
> >
> >  # convert the sequence to fasta format
> >  write.dna (lst, file = "DNAseq.fasta", format = "fasta", append =
> >FALSE,
> >   nbcol= 6, colsep= " ", colw= 10)
> >
> >
> >Any help will be appreciated.
> >Thank you.
> >
> >Kelas
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] write.dna command

2017-06-19 Thread William Michels via R-help
Hi Mogjib,

Does the following solve your issue?

> setwd(WD)
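
In other words, setwd() has to be called as a function; the line
"setwd <- (WD)" in your script only creates a new object named
"setwd" and never changes the working directory. The path also looks
like it is missing a "/" after the tilde. A minimal sketch (path taken
from your post, adjust as needed):

WD <- "~/Documents/Scripting/R_Studio/Sequences/"   # note the "/" after "~"
setwd(WD)                                           # call it, don't assign to it
getwd()                                             # check that it worked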



On Sat, Jun 17, 2017 at 7:26 AM, Mogjib Salek  wrote:

> Hi all,
>
> I am learning R by "doing". And this is my first post.
>
> I want to use R: 1- to fetch a DNA sequence from a databank (see bellow)
> and 2- store it as FASTA file.
>
> The problem: neither an error is prompted nor the fasta file is created.
> Testing the code (see bellow), I notice that everything works until
> the *"write.dna"
> *command - which is not creating the fasta file.
>
> Here is my code:
>
> Get gene sequence from GenBank and store it as fasta file
> 16 June 2017
>
> #1- Set the working directory and make sure the right libraries are
> installed
> (make sure 'ape' and 'seqinr' packages are installed)
>
> WD <- "~Documents/Scripting/R_Studio/Sequences/"
> setwd <- (WD)
>
> #2- Fetch a sequence ( bellow, "enter  manually the desired DNA ID") from
> GenBank and store it as fasta file.
>
>   DNAid <- "JF806202"
>
> # Store the sequence in lst (a list)
>   lst <- read.GenBank(DNAid, as.character = T)
>
>   # convert the sequence to fasta format
>write.dna (lst, file = "DNAseq.fasta", format = "fasta", append =
> FALSE,
>nbcol= 6, colsep= " ", colw= 10)
>
>
> Any help will be appreciated.
> Thank you.
>
> Kelas
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pairs: adjusting margins and labeling axes

2016-07-20 Thread William Michels via R-help
Hi Michael, is this the direction you'd like to go (simplified)?

?pairs
pairs(iris, log="xy", asp=1, gap=0.1)

--Bill.


On Tue, Jul 19, 2016 at 2:37 PM, Michael Young  wrote:

> I want to make this as easy as possible.  The extra space could just go
> around the plot in the margin area.  I could then use a cropping tool to
> paste the plot into Excel or Word.
>
> I'm not opposed to using another package, but I'd need some kind of
> pre-existing code to tinker with.
>
> On Tue, Jul 19, 2016 at 11:16 AM, Greg Snow <538...@gmail.com> wrote:
>
> > If you want square plots on a rectangular plotting region, then where
> > do you want the extra space to go?
> >
> > One option would be to add outer margins to use up the extra space.
> > The calculations to figure out exactly how much space to put in the
> > outer margins will probably not be trivial.
> >
> > Another option would be to not use `pairs`, but use the `layout`
> > function directly and loops to do your plots (and use the `respect`
> > argument to `layout`).
> >
> > On Tue, Jul 19, 2016 at 11:29 AM, michael young
> >  wrote:
> > > The default shape for this correlation scatterplot is rectangle.  I
> > changed
> > > it to square, but then the x-axis spacing between squares are off.  Is
> > > there an easy way to change x-axis spacing between squares to that of
> the
> > > y-axis spacing size?
> > >
> > > I decided to hide the name values of the diagonal squares.  I want them
> > > along the x and y axis instead, outside of the fixed number scale I
> have.
> > > I haven't seen any online example of 'pairs' with this and all my
> > searches
> > > have yielded nothing.  Any ideas?  Thanks
> > >
> > > par(pty="s")
> > > panel.cor <- function(x, y, digits = 2, prefix="", cex.cor, ...)
> > > {
> > > usr <- par("usr"); on.exit(par(usr))
> > > par(usr = c(0, 1, 0, 1),xlog=FALSE,ylog=FALSE)
> > > # correlation coefficient
> > > r <- cor(x, y)
> > > txt <- format(c(r, 0.123456789), digits = digits)[1]
> > > txt <- paste("r= ", txt, sep = "")
> > > if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
> > > text(0.5, 0.6, txt, cex=cex.cor * r)
> > >
> > > # p-value calculation
> > > p <- cor.test(x, y)$p.value
> > > txt2 <- format(c(p, 0.123456789), digits = digits)[1]
> > > txt2 <- paste("p= ", txt2, sep = "")
> > > if(p<0.01) txt2 <- paste("p= ", "<0.01", sep = "")
> > > text(0.5, 0.4, txt2)
> > > }
> > >
> > > pairs(iris, upper.panel = panel.cor,xlim=c(0.1,10),
> > > ylim=c(0.1,10),log="xy",text.panel = NULL,pch=".")
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > 538...@gmail.com
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW

2018-11-01 Thread William Michels via R-help
Hi Bogdan,

Are you saying you want to drop columns that sum to zero? If so, I'm
not sure you've given us a good example dataframe, since all your
numeric columns give non-zero sums.

Otherwise, what you're asking for is trivial. Below is an example
dataframe ("ygene") with an example "AGA" column that gets dropped:

> xgene <- data.frame(TTT=c(0,1,0,0),
+TTA=c(0,1,1,0),
+ATA=c(1,0,0,0),
+gene=c("gene1", "gene2", "gene3", "gene4"))
>
> xgene[ , colSums(xgene[,1:3]) > 0 ]
  TTT TTA ATA  gene
1   0   0   1 gene1
2   1   1   0 gene2
3   0   1   0 gene3
4   0   0   0 gene4
>
> ygene <- data.frame(TTT=c(0,1,0,0),
+ TTA=c(0,1,1,0),
+ AGA=c(0,0,0,0),
+ gene=c("gene1", "gene2", "gene3", "gene4"))
>
> ygene[ , colSums(ygene[,1:3]) > 0 ]
  TTT TTA  gene
1   0   0 gene1
2   1   1 gene2
3   0   1 gene3
4   0   0 gene4
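
If you want to be explicit about keeping the non-numeric "gene"
column (rather than relying on the length-3 logical index being
recycled across all four columns), you can append a TRUE for it:

> ygene[ , c(colSums(ygene[,1:3]) > 0, TRUE) ]
  TTT TTA  gene
1   0   0 gene1
2   1   1 gene2
3   0   1 gene3
4   0   0 gene4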


HTH,

Bill.

William Michels, Ph.D.


On Thu, Nov 1, 2018 at 5:45 PM, Bogdan Tanasa  wrote:
> Dear all, please may I ask for a suggestion :
>
> considering a dataframe  that contains the numerical values for gene
> expression, for example :
>
>  x = data.frame(TTT=c(0,1,0,0),
>TTA=c(0,1,1,0),
>ATA=c(1,0,0,0),
>gene=c("gene1", "gene2", "gene3", "gene4"))
>
> how could I select only the COLUMNS where the value of a GENE (a ROW) is
> non-zero ?
>
> thank you !
>
> -- bogdan
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW

2018-11-01 Thread William Michels via R-help
Perhaps one of the following two methods:

> zgene = data.frame(  TTT=c(0,1,0,0),
+TTA=c(0,1,1,0),
+ ATA=c(1,0,0,0),
+  ATT=c(0,0,0,0),
+ row.names=c("gene1", "gene2", "gene3", "gene4"))
> zgene
  TTT TTA ATA ATT
gene1   0   0   1   0
gene2   1   1   0   0
gene3   0   1   0   0
gene4   0   0   0   0
>
> zgene[ , zgene[2,1:4] > 0]
  TTT TTA
gene1   0   0
gene2   1   1
gene3   0   1
gene4   0   0
>
> zgene[ , zgene[rownames(zgene) == "gene2",1:4] > 0]
  TTT TTA
gene1   0   0
gene2   1   1
gene3   0   1
gene4   0   0
>
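
One caveat: if only a single column matches (e.g. for "gene1", where
only ATA is non-zero), "[" simplifies the result to a plain vector by
default. Add drop=FALSE to keep a data frame:

> zgene[ , zgene["gene1", ] > 0]
[1] 1 0 0 0
> zgene[ , zgene["gene1", ] > 0, drop=FALSE]
      ATA
gene1   1
gene2   0
gene3   0
gene4   0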

Best Regards,

Bill.

William Michels, Ph.D.



On Thu, Nov 1, 2018 at 9:07 PM, Bogdan Tanasa  wrote:
> Dear Bill, and Bill,
>
> many thanks for taking the time to advice, and for your suggestions. I
> believe that I shall rephrase a bit my question, with a better example :
> thank you again in advance for your help.
>
> Let's assume that we start from a data frame :
>
> x = data.frame(  TTT=c(0,1,0,0),
>TTA=c(0,1,1,0),
> ATA=c(1,0,0,0),
>  ATT=c(0,0,0,0),
> row.names=c("gene1", "gene2", "gene3", "gene4"))
>
> Shall we select "gene2", at the end, we would like to have ONLY the COLUMNS,
> where "gene2" is NOT-ZERO. In other words, the output contains only the
> first 2 columns :
>
> output = data.frame(  TTT=c(0,1,0,0),
>TTA=c(0,1,1,0),
>row.names=c("gene1", "gene2", "gene3",
> "gene4"))
>
>  with much appreciation,
>
> -- bogdan
>
> On Thu, Nov 1, 2018 at 6:34 PM William Michels 
> wrote:
>>
>> Hi Bogdan,
>>
>> Are you saying you want to drop columns that sum to zero? If so, I'm
>> not sure you've given us a good example dataframe, since all your
>> numeric columns give non-zero sums.
>>
>> Otherwise, what you're asking for is trivial. Below is an example
>> dataframe ("ygene") with an example "AGA" column that gets dropped:
>>
>> > xgene <- data.frame(TTT=c(0,1,0,0),
>> +TTA=c(0,1,1,0),
>> +ATA=c(1,0,0,0),
>> +gene=c("gene1", "gene2", "gene3", "gene4"))
>> >
>> > xgene[ , colSums(xgene[,1:3]) > 0 ]
>>   TTT TTA ATA  gene
>> 1   0   0   1 gene1
>> 2   1   1   0 gene2
>> 3   0   1   0 gene3
>> 4   0   0   0 gene4
>> >
>> > ygene <- data.frame(TTT=c(0,1,0,0),
>> + TTA=c(0,1,1,0),
>> + AGA=c(0,0,0,0),
>> + gene=c("gene1", "gene2", "gene3", "gene4"))
>> >
>> > ygene[ , colSums(ygene[,1:3]) > 0 ]
>>   TTT TTA  gene
>> 1   0   0 gene1
>> 2   1   1 gene2
>> 3   0   1 gene3
>> 4   0   0 gene4
>>
>>
>> HTH,
>>
>> Bill.
>>
>> William Michels, Ph.D.
>>
>>
>> On Thu, Nov 1, 2018 at 5:45 PM, Bogdan Tanasa  wrote:
>> > Dear all, please may I ask for a suggestion :
>> >
>> > considering a dataframe  that contains the numerical values for gene
>> > expression, for example :
>> >
>> >  x = data.frame(TTT=c(0,1,0,0),
>> >TTA=c(0,1,1,0),
>> >ATA=c(1,0,0,0),
>> >gene=c("gene1", "gene2", "gene3", "gene4"))
>> >
>> > how could I select only the COLUMNS where the value of a GENE (a ROW) is
>> > non-zero ?
>> >
>> > thank you !
>> >
>> > -- bogdan
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] loading the xlsx library

2019-01-14 Thread William Michels via R-help
Hello Bernard,

You might consider using the "readxl" package, which (from the package
description), "Works on Windows, Mac and Linux without external
dependencies."

 https://CRAN.R-project.org/package=readxl
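
Basic use is a single call (the file name below is made up; sheets can
be referred to by name or by position):

library(readxl)
dat <- read_excel("myworkbook.xlsx", sheet = 1)   # or sheet = "Sheet1"
head(dat)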

HTH, Bill.

William Michels, Ph.D.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a mean line plot

2019-04-14 Thread William Michels via R-help
So you're saying rowMeans(cbind(matrix_a, matrix_b)) worked to obtain
your X-axis values?

Wild guess here, are you simply looking for:
colMeans(rbind(matrix_a, matrix_b)) to obtain your Y-axis values?

[Above assuming matrix_a and matrix_b have identical dimensions (nrow, ncol)].
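
A tiny example with two made-up vectors, just to show the orientation
of each call (both return the element-by-element average here):

a <- c(1, 2, 3)
b <- c(3, 4, 5)
rowMeans(cbind(a, b))   # 2 3 4
colMeans(rbind(a, b))   # 2 3 4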

--Bill

William Michels, Ph.D.


On Fri, Apr 12, 2019 at 11:09 AM rain1290--- via R-help  wrote:
>
> Hi Eric,
>
> Ah, I apologize, and thank you for your response!
> I just figured out a way to average my x-values, so at least that is solved. 
> I will still include the data for the two variables (1-dimensional) of 
> interest that I was trying to average, just to show what was done:
> get2.teratons #(90 values)
> get5.teratons #(90 values)
> Here is what get2.teratons looks like (same idea for get5.teratons):
> >print(get2.teratons)
> [1] 0.4558545 0.4651129 0.4747509 0.4848242 0.4950900 0.5056109 0.5159335
> 0.5262532 0.5372275 0.5481839 0.5586787 0.5694379 0.5802970
> [14] 0.5909211 0.6015753 0.6124256 0.6237733 0.6353634 0.6467227 0.6582857
> 0.6702509 0.6817027 0.6935311 0.7060161 0.7182312 0.7301909
> [27] 0.7422574 0.7544744 0.7665907 0.7786409 0.7907518 0.8032732 0.8158733
> 0.8284363 0.8413905 0.8545881 0.8674711 0.8797701 0.8927392
> [40] 0.9059937 0.9189707 0.9317215 0.9438155 0.9558035 0.9673665 0.9784927
> 0.9900898 1.0020388 1.0132683 1.0240023 1.0347708 1.0456077
> [53] 1.0570347 1.0682903 1.0793535 1.0901511 1.1001753 1.1101276 1.1199142
> 1.1293237 1.1384669 1.1470002 1.1547341 1.1622488 1.1697549
> [66] 1.1777542 1.1857587 1.1930233 1.1999645 1.2067172 1.2132979 1.2199317
> 1.2265673 1.2328599 1.2390689 1.2446050 1.2495579 1.2546455
> [79] 1.2599212 1.2648733 1.2700068 1.2753889 1.2807509 1.2856922 1.2905927
> 1.2953338 1.3000484 1.3045992 1.3091128 1.3144190
> The following worked in terms of averaging all of the elements of 
> get2.teratons and get5.teratons:
> rowMeans(cbind(get2.teratons,get5.teratons))
> However, I am trying to do something similar for the values on my y-axis. So, 
> for now, here are the two variables (3-dimensional) that I would like to 
> average:
> subset
> subset5
> Using the print function for "subset" (same idea for subset5):
> >print(subset)
> class   : RasterStack
> dimensions  : 64, 128, 8192, 90  (nrow, ncol, ncell, nlayers)
> resolution  : 2.8125, 2.789327  (x, y)
> extent  : -181.4062, 178.5938, -89.25846, 89.25846  (xmin, xmax, ymin,
> ymax)
> coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
> names   : X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14,
> X15, ...
> >dim(subset)
> [1]  64 128  90
> >dim(subset5)
> [1]  64 128  90
> I tried `mean(subset,subset5)`, which works, BUT it combines the 90 layers 
> into 1 layer. I want keep the number of layers at 90, but simply average each 
> of the grid cell values of "subset" and "subset5" for each layer. So, for 
> instance, I want to average the values of each grid cell of layer 1 of 
> "subset" with the values of each grid cell of layer 1 of "subset5", and then 
> average those values of layer 2 of "subset" with those values of layer 2 of 
> "subset5"..all the way to layer 90. That way, I have 90 averages across 
> all grid cells.
> Here is what the data looks like for "subset":
> >dput(head(subset,5))
> structure(c(11.5447145886719, 11.2479725852609, 10.0223480723798,
> 11.4909216295928, 12.5930442474782, 15.0295264553279, 14.6107862703502,
> 13.3623332250863, 10.4473929153755, 13.262210553512, 13.3166334126145,
> 13.7211008928716, 10.594900790602, 11.7217378690839, 10.8397546224296,
> 14.2727348953485, 13.6185416020453, 12.7485566306859, 11.7246472276747,
> 10.6815265025944, 13.1605062168092, 12.9131189547479, 12.6493454910815,
> 11.6938022430986, 11.4522186107934, 8.84930260945112, 11.5785481408238,
> 12.9859233275056, 13.6702361516654, 11.863912967965, 11.6624090820551,
> 12.1465771459043, 12.9789240192622, 13.5916746687144, 15.0383287109435,
> 7.89674604311585, 8.14079332631081, 7.05628590658307, 6.99759456329048,
> 8.06435288395733, 8.00622920505702, 7.35754533670843, 6.57949370797724,
> 6.26998774241656, 6.10911303665489, 10.1576759945601, 9.83650996349752,
> 10.6277788057923, 10.3647025069222, 9.38627037685364, 28.411143925041,
> 27.3436004295945, 25.7670222781599, 24.1854049265385, 22.7183715440333,
> 10.8529561199248, 11.1584928352386, 11.4545458462089, 11.7570801638067,
> 11.6314635146409, 13.7268429156393, 12.4547378160059, 12.8433785866946,
> 10.282119596377, 9.66278391424567, 6.39572446234524, 8.4569685626775,
> 12.253624945879, 12.4784250743687, 13.6823802720755, 8.65540341474116,
> 8.34308553021401, 8.30261853989214, 7.9798299819231, 7.96007991302758,
> 13.3976918645203, 15.2056947816163, 15.3097502421588, 18.0296610575169,
> 17.918016621843, 14.121591579169, 14.30915

Re: [R] Split Strings

2016-01-17 Thread William Michels via R-help
> str_1 <- list("pc_m2_45_ssp3_wheat", "pc_m2_45_ssp3_wheat", "ssp3_maize", 
> "m2_wheat")
> str_2 <- strsplit(unlist(str_1), "_")
> max.length <- max(sapply(str_2,length))
> str_3 <- lapply(lapply(str_2, unlist), "length<-", max.length)
> str_3

See: 
http://stackoverflow.com/questions/27995639/i-have-a-numeric-list-where-id-like-to-add-0-or-na-to-extend-the-length-of-the
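
If you then want the pieces as columns of a data frame (one row per
original string, with the NA padding at the end), you can stack str_3
row-wise:

str_mat <- do.call(rbind, str_3)
str_df  <- as.data.frame(str_mat, stringsAsFactors = FALSE)
str_df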

Hope this helps,

Bill

William Michels, Ph.D.


On Sun, Jan 17, 2016 at 12:56 PM, Miluji Sb  wrote:
> I have a list of strings of different lengths and would like to split each
> string by underscore "_"
>
> pc_m2_45_ssp3_wheat
> pc_m2_45_ssp3_wheat
> ssp3_maize
> m2_wheat
>
> I would like to separate each part of the string into different columns
> such as
>
> pc m2 45 ssp3 wheat
>
> But because of the different lengths - I would like NA in the columns for
> the variables have fewer parts such as
>
> NA NA NA m2 wheat
>
> I have tried unlist(strsplit(x, "_")) to split, it works for one variable
> but not for the list - gives me "non-character argument" error. I would
> highly appreciate any help. Thank you!
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R editor for Mac

2016-01-20 Thread William Michels via R-help
Hello Christofer!

For text-editing the R.app GUI has always been fabulous. An old
mainstay on the Mac (after Apple's TextEdit) has been TextWrangler,
and its big-brother, BBEdit. For development RStudio is quite nice,
and--based partly on RStudio's offering of a Vim-compatibility
mode--Vim has become a recent interest, although it has a steep
learning curve. Of course, Vim itself is already available at the
Terminal command line, but a more Mac-like version named MacVim is
available at:

https://github.com/b4winckler/macvim/releases

Further instructions on using Vim with R are available here:
http://manuals.bioinformatics.ucr.edu/home/programming-in-r/vim-r

BTW, it looks like you're using an older version of Mac OS X (Lion,
version 10.7.5) released in 2012. So some of the text editors
mentioned by others in this thread may not be their "latest and
greatest." Also, while you can run R version 3.2.1 right now (current
R version is 3.2.3), the R Mac page says "NOTE: the binary support for
OS X before Mavericks (10.9) is being phased out, we do not expect
further releases!" See:

https://cran.r-project.org/bin/macosx/

Mac OS X is now at version 10.11.4 (El Capitan). Wikipedia says some
Macs all the way back to 2007 can run El Capitan (see
https://en.wikipedia.org/wiki/OS_X_El_Capitan), so there may be an
upgrade path for you, if not all the way to El Capitan (10.11) then
maybe up to Mavericks (10.9), to keep you from having to compile
future R-versions from source. Finally, you should consider checking
out the R-Sig-Mac mailing list for further Mac-specific info:

https://stat.ethz.ch/mailman/listinfo/r-sig-mac

HTH,

Bill

W. Michels, Ph.D.


On Wed, Jan 20, 2016 at 10:22 AM, Christofer Bogaso
 wrote:

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R editor for Mac

2016-01-21 Thread William Michels via R-help
Run Atom with the language-r and r-exec packages:

"A language description and snippets for R"
https://atom.io/packages/language-r

"Send R code to various consoles"
https://atom.io/packages/r-exec


On Thu, Jan 21, 2016 at 9:54 AM, boB Rudis  wrote:
> Here you go Ista: https://atom.io/packages/repl (Atom rly isn't bad
> for general purpose data sci needs, I still think RStudio is the best
> environment for working with R projects).
>
> On Thu, Jan 21, 2016 at 12:48 PM, Ista Zahn  wrote:
>> On Jan 21, 2016 12:01 PM, "Philippe Massicotte" 
>> wrote:
>>>
>>> On 01/20/2016 07:22 PM, Christofer Bogaso wrote:

 Hi,

 Could you please suggest a good R editor for Mac OS X (10.7.5)
 Previously my operating system was Windows and there I used Notepad++,
 I really had very nice experience with it. However I dont see any Mac
 version is available for Mac.

 Appreciate your positive feedback.

 Thanks and regards,

>>> Atom seems to be a good choice also.
>>
>> Is it? Which package(s) should I install to write and run R code in Atom?
>> Certainly I don't see anything useful out of the box.
>>
>> Best,
>> Ista
>>>
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Documentation: Was -- identical() versus sapply()

2016-04-12 Thread William Michels via R-help
On Tue, Apr 12, 2016 at 9:44 AM, David Winsemius  wrote:

>
>  There need to be more worked examples, but those could easily be mined from 
> problems submitted as recorded in the R-help Archives and StackOverFlow.
>


This sounds like a great opportunity for R-users to contribute to the
community (and I certainly would love to participate).

One question for R-Core gurus: R-GUIs have the ability to open a
script window and use a shortcut to execute code in the R-Console. Can
each "Example" on the help pages be configured to do the same? Or at
least assist in block-copying to the Console?

We'd get a lot more people working through examples that way, and
contributors might come up with their own examples to illustrate a
particular function. A Dokuwiki site might be a place where people
could post and vote on new examples to be included in pre-existing
documentation.

--Bill
William Michels, Ph.D.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issue replacing dataset values from read data

2016-05-07 Thread William Michels via R-help
1. It's not immediately clear why you need the line "temp <- subset(df, id
== myid)"

2. The objects described by "temp$age", temp$agesmoke, and temp$yrsquit are
all vectors. So temp.yrssmoke is also a vector. This means that when you
replace, it should be with "<- temp.yrssmoke[i]", where "i" is the (row)
 number you're looping over (note "temp" re-numbers rows to 1 through 6,
another reason to remove the "temp" line).

3. Ditto for " <- (temp$cigsdaytotal[i]/20)*(temp.yrssmoke[i]) "
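
Putting 1-3 together, you could also drop the loop entirely and use a
single logical index (column names below are just the ones from your
code; note that yrsquit is read *after* it has been reset to 0, which
looks like what you intended):

ids <- c(2165, 2534, 2553, 2611, 2983, 3233)
sel <- df$id %in% ids
df$yrsquit[sel]  <- 0
df$yrssmoke[sel] <- df$age[sel] - (df$agesmoke[sel] + df$yrsquit[sel])
df$packyrs[sel]  <- (df$cigsdaytotal[sel] / 20) * df$yrssmoke[sel]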

Hope this helps!

Bill

W. Michels, Ph.D.



On Fri, May 6, 2016 at 3:19 PM, Chang, Emily  wrote:

> Dear all,
>
> I am reading a modest dataset (2297 x 644) with specific values I want to
> change. The code is inelegant but looks like this:
>
> df <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE)
>
> # yrsquit, packyrs missing for following IDs. Manually change.
> for(myid in c(2165, 2534, 2553, 2611, 2983, 3233)){
>  temp <- subset(df, id == myid)
>  df[df$id == myid , "yrsquit"] <- 0
>  temp.yrssmoke <- temp$age-(temp$agesmoke+temp$yrsquit)
>  df[df$id == myid , "yrssmoke"]  <- temp.yrssmoke
>  df[df$id == myid , "packyrs"] <-
> (temp$cigsdaytotal/20)*(temp.yrssmoke)
> }
>
> If I run just the first line and then the for loop, it works.
> If I run the first line and for loop together, yrsquit is properly
> replaced to == 0, but packyrs is NA still.
>
> Obviously there's many ways around this specific problem, but I was
> wondering what the issue is here, so as to look out for and avoid it in the
> future.
>
> Apologies for the lack of reproducible code; I haven't yet reproduced the
> problem with generated data.
>
> Much thanks in advance.
>
> Best regards,
> Emily
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Merging two columns of unequal length

2016-12-13 Thread William Michels via R-help
You should review "The Recycling Rule in R" before attempting to
perform functions on 2 or more vectors of unequal lengths:

https://cran.r-project.org/doc/manuals/R-intro.html#The-recycling-rule
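
For example, adding a length-6 and a length-2 vector silently reuses
the shorter one three times:

> 1:6 + 1:2
[1] 2 4 4 6 6 8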

Most often, the "Recycling Rule" does exactly what the researcher
intends (automatically). And in many cases, performing functions on
data of unequal (or not evenly divisible) lengths is either 1) an
indication of problems with the input data, or 2) an indication that
the researcher is unnecessarily 'forcing' data into a rectangular data
structure, when another approach might be better (e.g. the use of the
tapply function).

However, if you see no other way, the functions "cbind.na" and/or
"rbind.na" available from Andrej-Nikolai Spiess perform binding of
vectors without recycling:

http://www.dr-spiess.de/Rscripts.html

All you have to do is download and source the correct R-script, and
call the function:

> cbind(1:5, 1:2)
     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3    1
[4,]    4    2
[5,]    5    1

Warning message:
In cbind(1:5, 1:2) :
  number of rows of result is not a multiple of vector length (arg 2)

> source("/Users/myhomedirectory/Downloads/cbind.na.R")
> cbind.na(1:5, 1:2)
     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3   NA
[4,]    4   NA
[5,]    5   NA
>

This issue arises so often, Dr. Spiess's two scripts "rbind.na" and
"cbind.na" have my vote for inclusion into the base-R distribution.

Best of luck,

W Michels, Ph.D.


On Mon, Dec 12, 2016 at 3:41 PM, Bailey Hewitt  wrote:
>
> Dear R Help,
>
>
> I am trying to put together two columns of unequal length in a data frame. 
> Unfortunately, so far I have been unsuccessful in the functions I have tried 
> (such as cbind). The code I am currently using is : (I have highlighted the 
> code that is not working)
>
>
> y<- mydata[,2:75]
>
> year <- mydata$Year
>
> res <- data.frame()
>
> for (i in 1:74){
>
>   y.val <- y[,i]
>
>   lake.lm= lm(y.val ~ year)
>
>   lake.res=residuals(lake.lm)
>
>   new.res <- data.frame(lake.res=lake.res)
>
>   colnames(new.res) <- colnames(y)[i]
>
> #cbind doesn't work because of the unequal lengths of my data columns
>
>   res <- cbind(res, new.res)
>
>   print(res)
>
> }
>
>
> mydata is a csv file with "Year" from 1950 on as my first column and then 
> each proceeding column has a lake name and a day of year (single number) in 
> each row.
>
>
> Please let me know if there is any more information I can provide as I am new 
> to emailing in this list. Thank you for your time!
>
>
> Bailey Hewitt
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating possible cominations of a vector's elements

2016-12-22 Thread William Michels via R-help
Hi Dmitri,

> hoyt <- unlist(strsplit("how are you today", split="\\s"))
> y <- list()
> for(j in seq_along(hoyt))  y[[j]] <- sapply(combn(length(hoyt), j, 
> simplify=F, function(i) hoyt[i]), paste, collapse = " ")

> y

[[1]]
[1] "how"   "are"   "you"   "today"
[[2]]
[1] "how are"   "how you"   "how today" "are you"   "are today" "you today"
[[3]]
[1] "how are you"   "how are today" "how you today" "are you today"
[[4]]
[1] "how are you today"

>

It was unclear if you wanted combinations (per your subject line), or
consecutive-word substrings (per your example). The code above returns
combinations.
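
If consecutive-word substrings are what you were after, a quick sketch
along these lines (reusing "hoyt" from above) should do it:

n <- length(hoyt)
unlist(lapply(seq_len(n), function(i)
    sapply(i:n, function(j) paste(hoyt[i:j], collapse = " "))))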

If you actually want a third output--permutations--you'll have to look
at the permn() function in the "combinat" package, authored by Scott
Chasalow and maintained by Vince Carey.

Cheers,

Bill
William Michels, Ph.D.


On Fri, Dec 9, 2016 at 7:52 AM, Dimitri Liakhovitski
 wrote:
> Thanks a lot, David and Bill!
>
>
> On Thu, Dec 8, 2016 at 8:16 PM, David L Carlson  wrote:
>> Not my day. Another correction:
>>
>> makestrings <- function(vec) {
>>  len <- length(vec)
>>  idx <- expand.grid(1:len, 1:len)
>>  idx <- idx[idx$Var2 <= idx$Var1, c("Var2", "Var1")]
>>  mapply(function(x, y) paste(vec[x:y], collapse=" "),
>>   x=idx[, 1], y=idx[, 2])
>> }
>>
>> David C
>>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David L 
>> Carlson
>> Sent: Thursday, December 8, 2016 7:12 PM
>> To: Dimitri Liakhovitski ; r-help 
>> 
>> Subject: Re: [R] creating possible cominations of a vector's elements
>>
>> This corrects an error in my earlier function definition:
>>
>> makestrings <- function(vec) {
>>  len <- length(mystring.spl)
>>  idx <- expand.grid(1:len, 1:len)
>>  idx <- idx[idx$Var2 <= idx$Var1, c("Var2", "Var1")]
>>  mapply(function(x, y) paste(vec[x:y], collapse=" "),
>>   x=idx[, 1], y=idx[, 2])
>> }
>>
>> David C
>>
>> -Original Message-
>> From: David L Carlson
>> Sent: Thursday, December 8, 2016 5:51 PM
>> To: 'Dimitri Liakhovitski' ; r-help 
>> 
>> Subject: RE: [R] creating possible cominations of a vector's elements
>>
>> You can use expand.grid() and mapply():
>>
>> mystring <- "this is my vector"
>> mystring.spl <- strsplit(mystring, " ")[[1]]
>>
>> makestrings <- function(x) {
>>  len <- length(mystring.spl)
>>  idx <- expand.grid(1:len, 1:len)
>>  idx <- idx[idx$Var2 <= idx$Var1, c("Var2", "Var1")]
>>  mapply(function(x, y) paste(mystring.spl[x:y], collapse=" "),
>>   x=idx[, 1], y=idx[, 2])
>> }
>> makestrings(mystring.spl)
>>
>>  [1] "this"  "this is"   "this is my"
>>  [4] "this is my vector" "is""is my"
>>  [7] "is my vector"  "my""my vector"
>> [10] "vector"
>>
>> This makes a vector of strings but if you want a list use as.list(mapply())
>>
>> David L. Carlson
>> Department of Anthropology
>> Texas A&M University
>>
>>
>>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Dimitri 
>> Liakhovitski
>> Sent: Thursday, December 8, 2016 5:03 PM
>> To: r-help 
>> Subject: [R] creating possible cominations of a vector's elements
>>
>> Hello!
>>
>> I have a vector of strings 'x' that was based on a longer string
>> 'mystring' (the actual length of x is unknown).
>>
>> mystring <- "this is my vector"
>> x <- strsplit(mystr, " ")[[1]]
>>
>> I am looking for an elegant way of creating an object (e.g., a list)
>> that contains the following strings:
>>
>> "this"
>> "this is"
>> "this is my"
>> "this is my vector"
>> "is"
>> "is my"
>> "is my vector"
>> "my"
>> "my vector"
>> "vector"
>>
>> Thanks a lot!
>>
>> --
>> Dimitri
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Dimitri Liakhovitski
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] \n and italic() in legend()

2016-12-29 Thread William Michels via R-help
Hi Marc,

I can't seem to get "\n" to work,  but simply using c() and "y.intersp
= 1" looks fine:

> plot(1, 1)
> v1 <- c(expression(italic("p")*"-value"), expression("based on 
> "*italic("t")*"-test"))
> legend("topright", legend=v1, y.intersp = 1, bty="n")



Hope this helps,

Bill

William Michels, Ph.D.

On Thu, Dec 29, 2016 at 1:35 AM, Marc Girondot via R-help
 wrote:
> Hi everyone,
>
> Could someone help me to get both \n (return) and italic() in a legend. Here
> is a little example showing what I would like (but without the italic) and
> second what I get:
>
> plot(1, 1)
> v1 <- "p-value\nbased on t-test"
> legend("topright", legend=v1, y.intersp = 3, bty="n")
>
> plot(1, 1)
> v1 <- expression(italic("p")*"-value\nbased on "*italic("t")*"-test")
> legend("topright", legend=v1, y.intersp = 3, bty="n")
>
> The second one shows :
>
> -value
> pbased on t-test
>
> rather than the expected:
>
> p-value
> based on t-test
>
> Thanks a lot,
>
> Marc
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Export R output in Excel

2016-12-29 Thread William Michels via R-help
Hi Bryan (and Petr),

If you want to write tsv-style data from R to clipboard on a Mac (e.g.
for pasting into Numbers), you should do:

> x1 <- matrix(1:6, nrow =2)

> clip <- pipe("pbcopy", "w")
> write.table(x1, file=clip, sep = "\t", row.names = FALSE, fileEncoding = 
> "UTF-8" )
> close(clip)
> gc()

> ?write.table
> ?connections

Adding an extra call to gc() (garbage collection) after writing to
clipboard will close all unused connections (useful if a connection
has been entered incorrectly).
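
To go the other way (read whatever is currently on the Mac clipboard
back into R), the companion command-line tool is pbpaste:

x2 <- read.table(pipe("pbpaste"), sep = "\t", header = TRUE)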

HTH,

Bill
William Michels, Ph.D.

http://stackoverflow.com/questions/14547069/how-to-write-from-r-to-the-clipboard-on-a-mac
http://stackoverflow.com/questions/30445875/what-exactly-is-a-connection-in-r


On Wed, Dec 28, 2016 at 11:14 PM, PIKAL Petr  wrote:
> Hi
>
> For rectangular data
>
> write.table(tab, "clipboard", sep = "\t", row.names = F)
> followed by Ctrl-V in Excel
>
> or
> write.table(tab, "somefile.xls", sep = "\t", row.names = F)
>
> For free format output like summary(somefit) I prefer to copy it to Word and 
> use font like  Courier New with monospaced letters
>
> Cheers
> Petr
>
>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bryan Mac
>> Sent: Wednesday, December 28, 2016 10:45 PM
>> To: R-help@r-project.org
>> Subject: [R] Export R output in Excel
>>
>> Hi,
>>
>> How do I export results from R to Excel in a format-friendly way? For
>> example, when I copy and paste my results into excel, the formatting is
>> messed up.
>>
>> Thanks.
>>
>> Bryan Mac
>> bryanmac...@gmail.com
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> 
>
> This e-mail and any documents attached to it may be confidential and are 
> intended only for its intended recipients.
> If you received this e-mail by mistake, please immediately inform its sender. 
> Delete the contents of this e-mail with all attachments and its copies from 
> your system.
> If you are not the intended recipient of this e-mail, you are not authorized 
> to use, disseminate, copy or disclose this e-mail in any manner.
> The sender of this e-mail shall not be liable for any possible damage caused 
> by modifications of the e-mail or by delay with transfer of the email.
>
> In case that this e-mail forms part of business dealings:
> - the sender reserves the right to end negotiations about entering into a 
> contract in any time, for any reason, and without stating any reasoning.
> - if the e-mail contains an offer, the recipient is entitled to immediately 
> accept such offer; The sender of this e-mail (offer) excludes any acceptance 
> of the offer on the part of the recipient containing any amendment or 
> variation.
> - the sender insists on that the respective contract is concluded only upon 
> an express mutual agreement on all its aspects.
> - the sender of this e-mail informs that he/she is not authorized to enter 
> into any contracts on behalf of the company except for cases in which he/she 
> is expressly authorized to do so in writing, and such authorization or power 
> of attorney is submitted to the recipient or the person represented by the 
> recipient, or the existence of such authorization is known to the recipient 
> of the person represented by the recipient.
> __
> R-he

Re: [R] \n and italic() in legend()

2016-12-29 Thread William Michels via R-help
Hi Marc, I think it would be wrong to leave readers with the
impression that it's somehow improper to use c() in drawing a legend,
because in fact, it works so well. What doesn't work so well is mixing
expression() calls with escaped characters like "\n" (or "\r"), and
that's probably due to expression() using plotmath() and
as.graphicsAnnot() to draw text.

Maybe the take-home lesson is to not mix expression() and escaped
characters in a legend? If no expression() call is present, "\n" works
fine:

## legend: two lines per variable, no expression() call
plot(1, 1)
v1 <- c("some great text\nhere")
v2 <- c("some more great\ntext here")
legend("topright", legend=c(v1, v2), y.intersp = 1.5, bty="n",
lty=c(1, 1), lwd = c(2, 2), col=c("black", "red"))

If an expression() is present, every time legend() encounters a new
line (via either a compound expression, or via "\n"), it treats it as
a location to display a new variable. However, taking advantage of
plotmath(), you can simply use scriptstyle() or even
scriptscriptstyle(), to draw smaller text on one line:

## legend: expression() call /w single line per variable
plot(1, 1)
v1 <- expression(italic("p")*"-value based on "*italic("t")*"-test")
v2 <- expression(italic("w")*"-value for A and B identical models")
legend("topright", legend=c(v1, v2), y.intersp = 1.0, bty="n",
lty=c(1, 1), lwd = c(2,2), col=c("black", "red"))

## legend: expression() call /w two lines per variable
## (note lty, lwd, and col correction)
plot(1, 1)
v1 <- expression(italic("p")*"-value", "based on "*italic("t")*"-test")
v2 <- expression(italic("w")*"-value", "for A and B identical models")
legend("topright", legend=c(v1, v2), y.intersp = 1.0, bty="n",
lty=c(1, 0, 1, 0), lwd = c(2, 0, 2, 0), col=c("black", "", "red", ""))

## legend: expression() call /w single line per variable,
## smaller script
plot(1, 1)
v1 <- expression(scriptstyle(bold(italic("p")*"-value based on
"*italic("t")*"-test")))
v2 <- expression(scriptstyle(bold(italic("w")*"-value for A and B
identical models")))
legend("topright", legend=c(v1,v2), y.intersp = 1.0, bty="n", lty=c(1,
1), lwd = c(2,2), col=c("black", "red"))

## legend: expression() call /w single line per variable,
## even smaller script
plot(1, 1)
v1 <- expression(scriptscriptstyle(bold(italic("p")*"-value based on
"*italic("t")*"-test")))
v2 <- expression(scriptscriptstyle(bold(italic("w")*"-value for A and
B identical models")))
legend("topright", legend=c(v1,v2), y.intersp = 1.0, bty="n", lty=c(1,
1), lwd = c(2,2), col=c("black", "red"))

I'm loath to call your initial finding, 'an italicized character
jumping to the second line when used in conjunction with "\n"', a
bug, but maybe others can chime in as to why that happens.

HTH,

Bill
William Michels, Ph.D.



On Thu, Dec 29, 2016 at 1:45 PM, Marc Girondot  wrote:
> Hi,
> Thanks a lot to Duncan Mackay for the trick using atop() [but the legends
> are centered and not left aligned] and also for the suggestion of William
> Michels to use simply ",". However this last solution prevents to use
> several legends.
>
> Here is a solution to allow both return within a legend and several legends:
> plot(1, 1)
> v1 <- c(expression(italic("p")*"-value"), expression("based on
> "*italic("t")*"-test"))
> v2 <- c(expression(italic("w")*"-value for A"), expression("and B identical
> models"))
> legend("topright", legend=c(v1, v2), lty=c(1, 0, 1, 0), y.intersp = 1,
> bty="n", col=c("black", "", "red", ""))
>
> Thanks again
>
> Marc
>
>
> Le 29/12/2016 à 10:54, Duncan Mackay a écrit :
>>
>> Hi Marc
>>
>> Try atop
>>
>> plot(1, 1)
>> v1 <- expression(atop(italic("p")*"-value","based on
>> "*italic("t")*"-test"))
>> legend("topright", legend=v1, y.intersp = 3, bty="n")
>>
>>
>> Regards
>>
>> Duncan
>>
>> Duncan Mackay
>> Department of Agronomy and Soil Science
>> University of New England
>> Armidale NSW 2351
>> Email: home: mac...@northnet.com.au
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Marc
>> Girondot via R-help
>> Sent: Thursday, 29 December 2016 20:35
>> To: R-help Mailing List
>> Subject: [R] \n and italic() in legend()
>>
>> Hi everyone,
>>
>> Could someone help me to get both \n (return) and italic() in a legend.
>> Here is a little example showing what I would like (but without the
>> italic) and second what I get:
>>
>> plot(1, 1)
>> v1 <- "p-value\nbased on t-test"
>> legend("topright", legend=v1, y.intersp = 3, bty="n")
>>
>> plot(1, 1)
>> v1 <- expression(italic("p")*"-value\nbased on "*italic("t")*"-test")
>> legend("topright", legend=v1, y.intersp = 3, bty="n")
>>
>> The second one shows :
>>
>> -value
>> pbased on t-test
>>
>> rather than the expected:
>>
>> p-value
>> based on t-test
>>
>> Thanks a lot,
>>
>> Marc
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-proje

Re: [R] [FORGED] function for remove white space

2017-02-21 Thread William Michels via R-help
Hi José (and Rolf),

It's not entirely clear what type of 'whitespace' you're referring to,
but if you're using read.table() or read.csv() to create your
dataframe in the first place, setting 'strip.white = TRUE' will remove
leading and trailing whitespace 'from unquoted character fields
(numeric fields are always stripped).'

> ?read.table
>?read.csv
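
If the data frame already exists, base R's trimws() will strip leading
and trailing whitespace column by column, e.g. (with "df" standing in
for your data frame):

df[] <- lapply(df, function(x) if (is.character(x)) trimws(x) else x)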

Cheers,

Bill


On 2/21/17, Rolf Turner  wrote:
> On 22/02/17 12:51, José Luis Aguilar wrote:
>> Hi all,
>>
>> i have a dataframe with 34 columns and 1534 observations.
>>
>> In some columns I have strings with spaces, i want remove the space.
>> Is there a function that removes whitespace from the entire dataframe?
>> I use gsub but I would need some function to automate this.
>
> Something like
>
> X <- as.data.frame(lapply(X,function(x){gsub(" ","",x)}))
>
> Untested, since you provide no reproducible example (despite being told
> by the posting guide to do so).
>
> I do not know what my idea will do to numeric columns or to factors.
>
> However it should give you at least a start.
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to select one value per row (different columns) from array

2017-03-01 Thread William Michels via R-help
Hello Wolfgang,

Building on Peter Dalgaard's code, are you just trying to take a sample of
a random column from each row? You don't need to use apply:

> array[cbind(1:nrow(array), sample.int(ncol(array), nrow(array),
replace=TRUE ))]

Just a general note, since you're sampling one-column-per-row from an array
with more rows than columns, you'll have to set replace=TRUE. However,
there may be other datasets where you have more columns than rows and never
want to sample each column more than once, in which case you would set
replace=FALSE.

Best Regards.



On Wed, Mar 1, 2017 at 5:38 AM, peter dalgaard  wrote:

>
> array[cbind(1:999,vector)]
>
> -pd
>
> On 01 Mar 2017, at 14:28 , Wolfgang Waser 
> wrote:
>
> > Dear all,
> >
> > I have to pick one value per row from an array, but from row to row from
> > a different column. The column positions of the values for each row are
> > stored in a vector.
> >
> > array: 999 rows, 48 columns
> >
> > vector: 999 values (each between 1 and 48) indicating for each row which
> > value to pick from that row.
> >
> > Is there a non-loop way to pick the 999 values from the array, probably
> > using some form of ?apply?
> >
> >
> > Thank you very much for help and suggestions!
> >
> > Wolfgang
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unnesting JSON using R

2020-09-17 Thread William Michels via R-help
Hi Fred, I believe the preferred package is jsonlite:

https://cran.r-project.org/package=jsonlite
https://jeroen.cran.dev/jsonlite/index.html
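
Without seeing your actual JSON I can only sketch the general pattern,
but fromJSON() with flatten = TRUE will un-nest most data-frame-style
JSON into ordinary columns (file name below is a placeholder; a URL or
a raw JSON string works too):

library(jsonlite)
dat <- fromJSON("yourfile.json", flatten = TRUE)
str(dat)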

HTH, Bill.

W. Michels, Ph.D.

On Tue, Sep 15, 2020 at 1:48 PM Fred Kwebiha  wrote:
>
> Source=https://jsonformatter.org/e038ec
>
> The above is nested json.
>
> I want the output to be as below
> dataElements.name,dataElements.id,categoryOptionCombos.name,categoryOptionCombos.id
>
> Any help in r?
> *Best Regards,*
>
> *FRED KWEBIHA*
> *+256-782-746-154*
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lahman Baseball Data Using R DBI Package

2020-10-07 Thread William Michels via R-help
Hi Philip,

You've probably realized by now that R doesn't like column names that
start with a number. If you try to access an R-dataframe column named
2B or 3B with the familiar "$" notation, you'll get an error:

> library(DBI)
> library(RSQLite)
> con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
> Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 
> AND AB >600 ORDER BY AB DESC")
> Hack12Batting$AB
 [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602
> Hack12Batting$3B
Error: unexpected numeric constant in "Hack12Batting$3"

How to handle? You can rename columns on-the-fly by piping. See
reference [1] and use either library(magrittr) or library(dplyr) or a
combination thereof:

library(magrittr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.)))

#OR one of the following:

library(dplyr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`)

library(dplyr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.)))

library(dplyr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.)))
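
Strictly speaking the problem is that "2B" is not a syntactically
valid name for the bare "$" operator; if you would rather keep the
column names exactly as they come out of the database, backticks or
double brackets will still get at them:

Hack12Batting$`2B`
Hack12Batting[["3B"]]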

Best, Bill.

W. Michels, Ph.D.

[1] 
https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames










On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap  wrote:
>
> The double quotes are required by SQL if a name is not of the form
> letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL
> keyword like 'where' or 'select'.  If you are doing this from a function,
> you may as well quote all the names.
>
> -Bill
>
> On Fri, Oct 2, 2020 at 6:18 PM Philip  wrote:
>
> > The \”2B\” worked.  Have no idea why.  Can you point me somewhere that can
> > explain this to me.
> >
> > Thanks,
> > Philip
> >
> > *From:* Bill Dunlap
> > *Sent:* Friday, October 2, 2020 3:54 PM
> > *To:* Philip
> > *Cc:* r-help
> > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package
> >
> > Have you tried putting double quotes around 2B and 3B:  "...2B, 3B, ..."
> > -> "...\"2B\",\"3B\",..."?
> >
> > -Bill
> >
> > On Fri, Oct 2, 2020 at 3:49 PM Philip  wrote:
> >
> >> I’m trying to pull data from one table (batting) in the Lahman Baseball
> >> database.  Notice X2B for doubles and X3B for triples – fourth and fifth
> >> from the right.
> >>
> >> The dbGetQuery function runs fine when I leave there two out but I get
> >> error messages (in red) when I include 2B/3B or X2B/X3B.
> >>
> >> Can anyone give me some direction?
> >>
> >> Thanks,
> >> Philip Heinrich
> >>
> >> ***
> >> tail(dbReadTable(Lahman,"batting"))
> >>
> >> ID   playerID  yearIDstint teamID team_ID
> >> lgID   GG_batting   AB R H   X2BX3B   HR   RBI   SB
> >> 107414 107414 yastrmi01  2019   1   SFN   2920
> >> NL 107NA  371   64  101  22   3 21
> >> 552
> >> 107416 107416 yelicch01  20191   MIL   2911
> >> NL 130NA  489 100  161  29   3 4497   
> >> 30
> >> 107419 107419 youngal01 2019   1   ARI2896
> >> NL   17NA25 1  10   0
> >> 0  0 0
> >> 107420 107420 zagunma01   20191  CHN   2901  NL
> >> 30NA 36 2  93   0  0
> >> 5 0
> >> 107422 107422 zavalse01  20191  CHA   2900
> >> AL5NA 12 1  10   0
> >> 0   0 0
> >> 107427 107427 zimmery01 20191  WAS  2925  NL
> >> 52NA   171   20449  0  6  27 0
> >> 107428 107428 zobribe01   20191  CHN  2901
> >> NL  47NA   150   24   39 5  0  1
> >> 17 0
> >> 107429 107429 zuninmi01   20191  TBA   2922
> >> AL  90NA   26630  44   10  1  9
> >> 32 0
> >>
> >>
> >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> >> playerID,yearID,AB,R,H,2B,3B,HR,
> >> RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> >> batting
> >> WHERE yearID = 2018 AND AB >99")
> >> Error: unrecognized token: "2B"
> >>
> >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> >> playerID,yearID,AB,R,H,X2B,X3B,HR,
> >> RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> >> batting
> >> WHERE yearID = 2018 AND AB >99")
> >> Error: no such column: X2B
> >>
> >> [[alternative HTML version deleted]]
>

Re: [R] Installing Perl For Use in R

2020-10-08 Thread William Michels via R-help
Hi Philip,

"Perl Download"
https://www.perl.org/get.html

The above link gives you the option to install from source or from
ActiveState. The first link below (source) proudly proclaims, "Perl
compiles on over 100 platforms..." and the second link below (binary)
similarly proclaims, "Perl supports over 100 platforms!":

"Perl Source"
https://www.cpan.org/src/README.html

"Perl Ports (Binary Distributions)"
https://www.cpan.org/ports/index.html

HTH, Bill.

W. Michels, Ph.D.



On Tue, Oct 6, 2020 at 8:32 AM Marc Schwartz via R-help
 wrote:
>
> Hi,
>
> What OS are you on?
>
> It has been years since I used ActiveState, but it looks like you now need to 
> create an account with them prior to downloading the installation files. They 
> seem to give you the option of creating an account with them, or using 
> Github. I would opt for the former, even though I have a Github account.
>
> If you are on Windows, an alternative Perl distribution is Strawberry Perl:
>
>   http://strawberryperl.com
>
> Regards,
>
> Marc Schwartz
>
> > On Oct 6, 2020, at 11:21 AM, Philip  wrote:
> >
> > I’m getting nowhere with this.  From the website below  I clicked on the 
> > ActivePerl 5.26 button which seems to lead me into creating projects on the 
> > Cloud rather than downloading the software to my hard drive.
> >
> >https://www.activestate.com/products/perl/downloads/
> >
> > Can someone give me some advise?  What the website is trying to do seems 
> > rather shady.
> >
> > Thanks,
> > Philip
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating column differences

2021-03-24 Thread William Michels via R-help
Dear Jeff,

Rather than diff-ing a linear vector you're trying to diff values from
two different rows. Also you indicate that you want to place the
diff-ed value in the 'lower' row of a new column. Try this (note
insertion of an initial "zero" row):

> df <- data.frame(ID=1:5,Score=4*2:6)
> df1 <- rbind(c(0,0), df)
> cbind(df1, "diff"=c(0, diff(df1$Score)) )
  ID Score diff
1  0     0    0
2  1     8    8
3  2    12    4
4  3    16    4
5  4    20    4
6  5    24    4
>

HTH, Bill.

W. Michels, Ph.D.



On Wed, Mar 24, 2021 at 9:49 AM Jeff Reichman  wrote:
>
> r-help forum
>
>
>
> I'm trying to calculate the diff between two rows and them mutate the
> difference into a new column. I'm using the diff function but not giving me
> what I want.
>
>
>
> df <- data.frame(ID=1:5,Score=4*2:6)
>
>
>
> What I want, where
>
>   ID Score  diff
> 1  1     8     8
> 2  2    12     4
> 3  3    16     4
> 4  4    20     4
> 5  5    24     4
>
>
>
> What I am getting
>
>   ID Score  diff
> 1  1     8    NA
> 2  2    12     4
> 3  3    16     4
> 4  4    20     4
> 5  5    24     4
>
>
>
> Jeff
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating column differences

2021-03-24 Thread William Michels via R-help
More correctly, with an initial "NA" value in the "diff" column:

> df <- data.frame(ID=1:5,Score=4*2:6)
> df1 <- rbind(c(0,0), df)
> cbind(df1, "diff"=c(NA, diff(df1$Score)) )
  ID Score diff
1  0     0   NA
2  1     8    8
3  2    12    4
4  3    16    4
5  4    20    4
6  5    24    4
>
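A shorter sketch that skips the extra padding row, using the same df; this
reproduces the output requested in the original post (8 rather than NA in
the first row):

df <- data.frame(ID = 1:5, Score = 4*2:6)
df$diff <- c(df$Score[1], diff(df$Score))
df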

HTH, Bill.
On Wed, Mar 24, 2021 at 10:59 AM William Michels  wrote:
>
> Dear Jeff,
>
> Rather than diff-ing a linear vector you're trying to diff values from
> two different rows. Also you indicate that you want to place the
> diff-ed value in the 'lower' row of a new column. Try this (note
> insertion of an initial "zero" row):
>
> > df <- data.frame(ID=1:5,Score=4*2:6)
> > df1 <- rbind(c(0,0), df)
> > cbind(df1, "diff"=c(0, diff(df1$Score)) )
>   ID Score diff
> 1  0     0    0
> 2  1     8    8
> 3  2    12    4
> 4  3    16    4
> 5  4    20    4
> 6  5    24    4
> >
>
> HTH, Bill.
>
> W. Michels, Ph.D.
>
>
>
> On Wed, Mar 24, 2021 at 9:49 AM Jeff Reichman  wrote:
> >
> > r-help forum
> >
> >
> >
> > I'm trying to calculate the diff between two rows and them mutate the
> > difference into a new column. I'm using the diff function but not giving me
> > what I want.
> >
> >
> >
> > df <- data.frame(ID=1:5,Score=4*2:6)
> >
> >
> >
> > What I want, where
> >
> >   ID Score  diff
> > 1  1     8     8
> > 2  2    12     4
> > 3  3    16     4
> > 4  4    20     4
> > 5  5    24     4
> >
> >
> >
> > What I am getting
> >
> >   ID Score  diff
> > 1  1     8    NA
> > 2  2    12     4
> > 3  3    16     4
> > 4  4    20     4
> > 5  5    24     4
> >
> >
> >
> > Jeff
> >
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Stata/Rstudio evil attributes

2021-04-10 Thread William Michels via R-help
Hi Roger,

You could look at the attributes() function in base-R. See:

> ?attributes

From the help-page:

> ## strip an object's attributes:
> attributes(x) <- NULL
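Applied to your example, a minimal sketch (assuming D is the data.frame
loaded from D.Rda in your code) would strip the Stata/haven attributes from
every column:

D[] <- lapply(D, function(col) { attributes(col) <- NULL; col })
is.vector(D$x)   # now TRUE, so qss() takes its is.vector() branch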

HTH, Bill.

W. Michels, Ph.D.



On Sat, Apr 10, 2021 at 4:20 AM Koenker, Roger W  wrote:
>
> Wolfgang,
>
> Thanks, this is _extremely_ helpful.
>
> Roger
>
> > On Apr 10, 2021, at 11:59 AM, Viechtbauer, Wolfgang (SP) 
> >  wrote:
> >
> > Dear Roger,
> >
> > The problem is this. qss() looks like this:
> >
> > if (is.matrix(x)) {
> >   [...]
> > }
> > if (is.vector(x)) {
> >   [...]
> > }
> > qss
> >
> > Now let's check these if() statements:
> > is.vector(B$x) # TRUE
> > is.vector(D$x) # FALSE
> > is.matrix(B$x) # FALSE
> > is.matrix(D$x) # FALSE
> >
> > is.vector(D$x) being FALSE may be surprising, but see ?is.vector: 
> > "is.vector returns TRUE if x is a vector of the specified mode having no 
> > attributes other than names. It returns FALSE otherwise." And as D$x shows, 
> > this vector has additional attributes.
> >
> > So, with 'D', qss() returns the qss function (c.f., qss(B$x) and qss(D$x)) 
> > which makes no sense. So, the internal logic in qss() needs to be fixed.
> >
> >> In accordance with the usual R-help etiquette I first tried to contact the
> >> maintainer of the haven package, i.e. RStudio, which elicited the 
> >> response: "since
> >> the error is occurring outside RStudio we’re not responsible, so try Stack
> >> Overflow".  This is pretty much what I would have expected from the 
> >> capitalist
> >> running dogs they are.  Admittedly, the error is probably due to some 
> >> unforeseen
> >
> > This kind of bashing is really silly. Can you tell us again how much you 
> > paid for the use of the haven package?
> >
> > Best,
> > Wolfgang
> >
> >> -Original Message-
> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Koenker, 
> >> Roger W
> >> Sent: Saturday, 10 April, 2021 11:26
> >> To: r-help
> >> Subject: [R] Stata/Rstudio evil attributes
> >>
> >> As shown in the reproducible example below, I used the RStudio function 
> >> haven() to
> >> read a Stata .dta file, and then tried to do some fitting with the 
> >> resulting
> >> data.frame.  This produced an error from my fitting function rqss() in the 
> >> package
> >> quantreg.  After a bit of frustrated cursing, I converted the data.frame, 
> >> D, to a
> >> matrix A, and thence back to a data.frame B, and tried again, which worked 
> >> as
> >> expected.  The conversion removed the attributes of D.  My question is:  
> >> why were
> >> the attributes inhibiting the fitting?
> >>
> >> In accordance with the usual R-help etiquette I first tried to contact the
> >> maintainer of the haven package, i.e. RStudio, which elicited the 
> >> response: "since
> >> the error is occurring outside RStudio we’re not responsible, so try Stack
> >> Overflow".  This is pretty much what I would have expected from the 
> >> capitalist
> >> running dogs they are.  Admittedly, the error is probably due to some 
> >> unforeseen
> >> infelicity in my rqss() coding, but it does seem odd that attributes could 
> >> have
> >> such a drastic  effect.  I would be most grateful for any insight the R 
> >> commune
> >> might offer.
> >>
> >> #require(haven) # for reading dta file
> >> #Ddta <- read_dta(“foo.dta")
> >> #D <- with(Ddta, data.frame(y = access_merg, x = meannets_allhh, z = 
> >> meanhh))
> >> #save(D, file = "D.Rda")
> >> con <- url("http://www.econ.uiuc.edu/~roger/research/data/D.Rda";)
> >> load(con)
> >>
> >> # If I purge the Stata attributes in D:
> >> A <- as.matrix(D)
> >> B <- as.data.frame(A)
> >>
> >> # This works:
> >> with(D,plot(x, y, cex = .5, col = "grey"))
> >> taus <- 1:4/5
> >> require(quantreg)
> >> for(i in 1:length(taus)){
> >>   f <- rqss(y ~ qss(x, constraint = "I", lambda = 1), tau = taus[i], data 
> >> = B)
> >>   plot(f, add = TRUE, col = i)
> >> }
> >> # However, the same code with data = D, does not.  Why?
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] series of densities

2021-05-17 Thread William Michels via R-help
Hi Troels,

Have you considered using Lattice graphics?
Adapting from examples on the help page:

> library(lattice)
> ?histogram
> histogram( ~ BC | pH, data = ddd, type = "density",
   xlab  = "BC", layout = c(1, 3), aspect = 0.618,
   strip = strip.custom(strip.levels=c(TRUE,TRUE)),
   panel = function(x, ...) {
   panel.histogram(x, ...)
   panel.mathdensity(dmath = dnorm, col = 1,
   args = list(mean=mean(x), sd=sd(x)) )
   } )
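If you want the three 'type' densities overlaid within each pH panel, a
minimal sketch with lattice's densityplot(), again using your ddd data
frame, might be:

densityplot(~ BC | factor(pH), data = ddd, groups = type,
            plot.points = FALSE, auto.key = list(columns = 3),
            layout = c(1, 3), xlab = "BC")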


HTH, Bill.

W. Michels, Ph.D.






On Mon, May 17, 2021 at 8:12 AM Troels Ring  wrote:
>
> Dear friends
> I'm trying to plot in silico derived values of 3 types of
> buffer-capacities  over pH values and want densities of the three types
> together at each pH with the pH values on the abscissa.
>
> I have generated some data
>
> set.seed(2345)
> pHs <- c(7.2,7.4,7.6)
> pH <- rep(pHs,each=30)
> BC <- rep(rep(c(20,10,10),each=10),3)+rnorm(90,0,5)
> type <- rep(rep(c("TOT","NC","CA"),each=10),3)
>
> ddd <- data.frame(pH,BC,type)
>
> GG <- ggplot()
> for (i in 1:3) {
>dd <- ddd[ddd$pH==pHs[i],]
>GG <- GG + geom_density(data=dd,aes(x=BC,fill=type),alpha=0.1)
> }
> GG
>
> but here I only get all pH values  plotted together whereas I want 3
> series in the vertical direction at the three pH values.
>
> I wonder how this could be done?
>
> All best wishes
>
> Troels Ring, MD
> Aalborg, Denmark
>
> PS: Windows 10,
>
> R version 4.0.5 (2021-03-31
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Beginner problem - using mod function to print odd numbers

2021-06-05 Thread William Michels via R-help
> i <- 1L; span <- 1:100; result <- NA;
> for (i in span){
+ ifelse(i %% 2 != 0, result[i] <- TRUE, result[i] <- FALSE)
+ }
> span[result]
 [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43
45 47 49 51 53 55 57
[30] 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99
>

HTH, Bill.

W. Michels, Ph.D.


On Sat, Jun 5, 2021 at 12:55 AM Stefan Evert  wrote:
>
> >
> > I don't understand. --
> >
> > 7%%2=1
> > 9%%2=1
> > 11%%2=1
> >
> > Why aren't these numbers printing?
> >
> > num<-0
> > for (i in 1:100){
> >  num<-num+i
> > if (num%%2 != 0)
> >  print(num)
> > }
>
> Your code tests the numbers
>
> 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, …
>
> and correctly prints the odd ones among them.
>
> But I suppose that's not what you wanted to do?
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Beginner problem - using mod function to print odd numbers

2021-06-06 Thread William Michels via R-help
Dear Bert,

First off, I want to thank you for the many hundreds (if not
thousands) of excellent posts I've read from you on this mailing list
over the years. And you are absolutely correct that when using the
`%%` modulo operator, your code is the most compact and the most
idiomatic.

That being said, if someone coming from another programming language
is posting code on this mailing list using a `for()` loop,  they may
be most comfortable getting working code back from this mailing-list
that still uses a `for()` loop in R. Furthermore, people often start
by *filtering* their data, when a better approach might be to first
*recode* it, for which the `ifelse()` function provides a nice
solution. But of course, a more simple approach than I previously
posted would be below (although less idiomatic than your answer):

> object <- 1:100
> index <- ifelse(object %% 2 == 1, TRUE, FALSE)
> object[index]
 [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43
45 47 49 51 53 55 57
[30] 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99
>

This brings us to the question of programming 'philosophy':  certain
languages advise "TMTOWTDI", while another advises "There should be
one--and preferably only one--obvious way to do it." Where does R fit
on this spectrum? I believe one of R's underappreciated strengths is
being more in-line with the "TMTOWTDI" principle. So people coming
from other languages can get good results right off the bat using
vectorized indexing/filtering, or for() loops, or the apply() family
of functions, or writing a custom function (all answers given in this
thread).

Finally, if a nascent R programmer ever ventures into filtering their
data using objects and indexes of different lengths, they should have
a grasp of the code examples below (recycling rule):

long_vec <- 1:16
print(long_vec)
short_vec <- rep(4,8)
print(short_vec)
print(long_vec[long_vec > short_vec])

span <- 1:length(short_vec)
print(long_vec[span][long_vec[span] > short_vec])
print(long_vec[span][long_vec > short_vec])
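(For reference, those three print() calls should return 5:16, then 5 6 7 8,
and then 5 6 7 8 followed by eight NAs -- the NAs being the recycling and
indexing surprise worth understanding.)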

Best Regards, Bill.

W. Michels, Ph.D.



On Sat, Jun 5, 2021 at 12:26 PM Bert Gunter  wrote:
>
> I'm sorry, but  this is a good example of how one should *not* do this in R. 
> I also should apologize for any pedantry that follows, but I believe this 
> serves as a nice example of the ideas.
>
> Two of R's central features as a "data science" language are that many of its 
> core capabilities are "vectorized" -- can calculate on whole objects (at the 
> user-visible interpreter level) rather than requiring explicit loops; and 
> that it can use object indexing in several different modalities, here logical 
> indexing, for extraction and replacement in whole objects such as vectors and 
> matrices. Not only does this typically yield simpler, more readable code 
> (admittedly, a subjective judgment), but it is also typically much faster, 
> though I grant you that this can often be overrated.
>
> In this instance, the several lines of looping code you presented can be 
> condensed into a single line:
>
> > span <- 1:20
> > span[span %% 2 == 1]
>  [1]  1  3  5  7  9 11 13 15 17 19
>
> ### Trickier, but perhaps instructive, is: ###
> > span[TRUE & span %% 2]
>  [1]  1  3  5  7  9 11 13 15 17 19
>
> All languages trade off various strengths and weaknesses, but I think it's 
> fair to say that one should try to work within the paradigms that are the 
> language's strengths when possible, R's vectorization and indexing in this 
> example.
>
> Cheers,
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Jun 5, 2021 at 11:05 AM William Michels via R-help 
>  wrote:
>>
>> > i <- 1L; span <- 1:100; result <- NA;
>> > for (i in span){
>> + ifelse(i %% 2 != 0, result[i] <- TRUE, result[i] <- FALSE)
>> + }
>> > span[result]
>>  [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43
>> 45 47 49 51 53 55 57
>> [30] 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99
>> >
>>
>> HTH, Bill.
>>
>> W. Michels, Ph.D.
>>
>>
>> On Sat, Jun 5, 2021 at 12:55 AM Stefan Evert  
>> wrote:
>> >
>> > >
>> > > I don't understand. --
>> > >
>> > > 7%%2=1
>> > > 9%%2=1
>> > > 11%%2=1
>> > >
>> > > What aren't these numbers printing ?
>> > >
>> > > num<-0
>> > > for (i in 1:100){

Re: [R] Need to compare two columns in two data.frames and return all rows from df where rows values are missing

2021-06-13 Thread William Michels via R-help
Maybe something like this?

> df_A <- data.frame(names=LETTERS[1:10], values_A=1:10)
> df_B <- data.frame(names=LETTERS[6:15], values_B=11:20)
> df_AB <- merge(df_A, df_B, by="names")
> df_AAB <- merge(df_A, df_AB, all.x=TRUE)
> df_BAB <- merge(df_B, df_AB, all.x=TRUE)
> df_C <- df_AAB[is.na(df_AAB$values_B), ]
> df_D <- df_BAB[is.na(df_BAB$values_A), ]
> df_AB
  names values_A values_B
1     F        6       11
2     G        7       12
3     H        8       13
4     I        9       14
5     J       10       15
> df_C
  names values_A values_B
1     A        1       NA
2     B        2       NA
3     C        3       NA
4     D        4       NA
5     E        5       NA
> df_D
   names values_B values_A
6      K       16       NA
7      L       17       NA
8      M       18       NA
9      N       19       NA
10     O       20       NA
>
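For just the non-matching rows, a shorter sketch using %in% with the same
df_A and df_B gives the same rows as df_C and df_D (without the all-NA
columns added by the merges):

df_C2 <- df_A[!(df_A$names %in% df_B$names), ]   # rows of A with names absent from B
df_D2 <- df_B[!(df_B$names %in% df_A$names), ]   # rows of B with names absent from A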

HTH, Bill.

W. Michels, Ph.D.











On Sun, Jun 13, 2021 at 2:38 PM Gregg Powell via R-help
 wrote:
>
> This is even complicated to write into a question
>
> Have two data.frames (A and B)
>
> data.frame A and B each have a name column. Want to compare A and B 
> data.frame to each other based on the values in the 'names' columns - for 
> every name that appears in dataframe A  but not B, I want to copy the 
> corresponding rows to a third dataframe C, and for every name that appears in 
> B but not A, I want to copy the corresponding rows to a fourth dataframe D. I 
> can then row bind the dataframes C and D together and get a complete list of 
> all the rows that were missing in either A or B.
>
> try as I might - I can't get this to work. Can someone help P-L-E-A-S-E
>
>
>  Thanks,
>  Gregg Powell
>  AZ, USA
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to spot/stop making the same mistake

2021-06-23 Thread William Michels via R-help
Hi Phillips,

Maybe these examples will be useful:

> vec <- c("a","b","c","d","e")
> vec[c(1,1,1,0,0)]
[1] "a" "a" "a"
> vec[c(1,1,1,2,2)]
[1] "a" "a" "a" "b" "b"
> vec[c(5,5,5,5,5)]
[1] "e" "e" "e" "e" "e"
> vec[c(NA,NA,NA,0,0,0,0)]
[1] NA NA NA
> vec[c(NA,NA,NA,1,1,1,1)]
[1] NA  NA  NA  "a" "a" "a" "a"
> vec[c(7:9)]
[1] NA NA NA
>
> R.version.string
[1] "R version 3.6.3 (2020-02-29)"

HTH, Bill.

W. Michels, Ph.D.








On Wed, Jun 23, 2021 at 10:39 AM Phillips Rogfield
 wrote:
>
> Dear all,
>
> thank for for your suggestion.
>
> Yes I come from languages where 1 means TRUE and 0 means FALSE. In
> particular from C/C++ and Python.
>
> Evidently this is not the case for R.
>
> In my mind I kind took for granted that that was the case (1=TRUE, 0=FALSE).
>
> Knowing this is not the case for R makes things simpler.
>
> Mine was just an example, sometimes I load datasets taken from outside
> and variables are coded with 1/0 (for example, a treatment variable may
> be coded that way).
>
> I also did not know the !!() syntax!
>
> Thank you for your help and best regards.
>
> On 23/06/2021 17:55, Bert Gunter wrote:
> > Just as a way to save a bit of typing, instead of
> >
> > > as.logical(0:4)
> > [1] FALSE  TRUE  TRUE  TRUE  TRUE
> >
> > > !!(0:4)
> > [1] FALSE  TRUE  TRUE  TRUE  TRUE
> >
> > DO NOTE that the parentheses in the second expression should never be
> > omitted, a possible reason to prefer the as.logical() construction.
> > Also note that !!  "acts [only] on raw, logical and number-like
> > vectors," whereas as.logical() is more general. e.g. (from ?logical):
> >
> > > charvec <- c("FALSE", "F", "False", "false","fAlse", "0",
> > +  "TRUE",  "T", "True",  "true", "tRue",  "1")
> > > as.logical(charvec)
> >  [1] FALSE FALSE FALSE FALSE    NA    NA  TRUE  TRUE  TRUE  TRUE    NA    NA
> > > !!charvec
> > Error in !charvec : invalid argument type
> >
> >
> > Cheers,
> > Bert
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Wed, Jun 23, 2021 at 8:31 AM Eric Berger  > > wrote:
> >
> > In my code, instead of 't', I name a vector of indices with a
> > meaningful
> > name, such as idxV, to make it obvious.
> >
> > Alternatively, a minor change in your style would be to replace your
> > definition of t by
> >
> > t <- as.logical(c(1,1,1,0,0))
> >
> > HTH,
> > Eric
> >
> >
> > On Wed, Jun 23, 2021 at 6:11 PM Phillips Rogfield
> > wrote:
> >
> > > I make the same mistake all over again.
> > >
> > > In particular, suppose we have:
> > >
> > > a = c(1,2,3,4,5)
> > >
> > > and a variable that equals 1 for the elements I want to select:
> > >
> > > t = c(1,1,1,0,0)
> > >
> > > To select the first 3 elements.
> > >
> > > The problem is that
> > >
> > > a[t]
> > >
> > > would repeat the first element 3 times .
> > >
> > > I have to either convert `t` to boolean:
> > >
> > > a[t==1]
> > >
> > > Or use `which`
> > >
> > > a[which(t==1)]
> > >
> > > How can I "spot" this error?
> > >
> > > It often happens in long scripts.
> > >
> > > Do I have to check the type each time?
> > >
> > > Do you have any suggestions?
> > >
> > > __
> > > R-help@r-project.org  mailing list
> > -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > 
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > 
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org  mailing list --
> > To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > 
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > 
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__

Re: [R] How to solve this?

2021-08-31 Thread William Michels via R-help
Hello,

You may have more luck posting your question to the R-SIG-Geo mailing-list:

https://stat.ethz.ch/mailman/listinfo/R-SIG-Geo/

Be sure to use an appropriate "Subject" line, for example, the
particular package/function that seems problematic.

HTH, Bill.

W. Michels, Ph.D.



On Mon, Aug 30, 2021 at 1:36 AM SITI AISYAH ZAKARIA
 wrote:
>
> Dear all,
>
> Can anyone help me? I'm using this coding to get the spatial gev model and
> plotting the quantile plot to justify that the model fits, but the plot does
> not look like a good fit. How can I solve this? Do I need to change any value
> in the coding? For example, I want to change form.shape <- shape ~ 1 to
> form.shape <- shape ~ average shape, but 'average' is not available in the coding.
>
> below is my coding
>
> #--
> # fitspatgev
> #--
> # response surface model
> Ozone<-data.matrix(Ozone_S)
> Ozone
> LotLatAlt<-data.matrix(OzoneLotLatAlt_S)
> LotLatAlt
> form.loc <- loc ~ Lon + Lat + Alt
> form.scale <- scale ~ 1
> form.shape <- shape ~ 1
> dim(Ozone_S)
> dim(OzoneLotLatAlt_S)
> fit1 <- fitspatgev(Ozone, scale(LotLatAlt,scale=FALSE), form.loc,
> form.scale, form.shape);fit1
> TIC(fit1)
> fit1$param
> #data(rain
> #symbolplot(rain, coord, plot.border = swiss)
> #check the fit of the model: compute QQplots for each station
> par(mfrow=c(1,3))
> par(mar=c(3,2.5,1.5,0.5),mgp=c(1.5,0.5,0),font.main=1,cex=0.66,cex.main=1)
> #calculation of confidance intervals
> nc <- 1
> M1 <- matrix(rfrechet(nc*nobs),nrow=nobs,ncol=nc)
> M <- t(apply(M1,2,sort))
> E <- boot::envelope(mat=M) #compute 95% confidance bands
> for (k in c(1:3,4,5,6)){ #choose some stations
>   park <- predict(fit1)[k,]
>   fk <- gev2frech(Ozone[,k],loc=park[4],scale=park[5],shape=park[6])
>   qqplot(y=fk,x=qfrechet((1:nobs)/(nobs+1)),log='xy',main=k,ylab='Sample
> Quantiles',xlab='Theoretical Quantiles',cex=0.7);
>   abline(0,1)
>   lines(y=E$overall[1,],x=qfrechet((1:nobs)/(nobs +1)),lty='dotted')
>   lines(y=E$overall[2,],x=qfrechet((1:nobs)/(nobs +1)),lty='dotted')
> }
>
> the quantile plot is in attachment file.
> please, can anyone help me?
>
> thank you
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to install npsm package

2021-09-01 Thread William Michels via R-help
Hi,

I found package "npsm" at the links below:

https://mran.microsoft.com/snapshot/2017-02-04/web/packages/npsm/index.html
https://cran.r-project.org/src/contrib/Archive/npsm/
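A minimal sketch for installing straight from the archive (the exact tarball
name is an assumption -- check the Archive page above for the current
version, and install any missing dependencies such as Rfit first):

install.packages(
  "https://cran.r-project.org/src/contrib/Archive/npsm/npsm_1.0.0.tar.gz",
  repos = NULL, type = "source")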

HTH, Bill.

W. Michels, Ph.D.


On Wed, Sep 1, 2021 at 8:27 AM  wrote:
>
> I need to install the package "npsm" to follow Kloke & McKean book. However,
> npsm is no longer on CRAN. So, please let me know in detail how to proceed
> to install it.
>
>
>
> Thanks.
>
>
>
> Carlos Gonzalez
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Connect R with Ensemble Rest API to fetch variations

2022-01-03 Thread William Michels via R-help
Hello Anas,

You can find courses and/or training materials on the Ensembl/EBI
websites, including R code:

https://www.ebi.ac.uk/training/online/courses/ensembl-rest-api/
http://training.ensembl.org/

You can also click on individual 'Ensembl REST API Endpoints', and
find sample R code there:

http://rest.ensembl.org/
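As a minimal sketch (not taken from those pages) of calling one endpoint
from R -- the 'overlap/id' endpoint with feature=variation is an assumption
to verify against the endpoint list, and the gene ID is just an example:

library(httr)
library(jsonlite)
gene_id <- "ENSG00000157764"            # example Ensembl gene ID; use your own
r <- GET(paste0("https://rest.ensembl.org/overlap/id/", gene_id),
         query = list(feature = "variation"),
         accept("application/json"))
stop_for_status(r)
snps <- fromJSON(content(r, as = "text", encoding = "UTF-8"))
head(snps)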

Good luck,

Bill.

W. Michels, Ph.D.





On Mon, Jan 3, 2022 at 5:08 PM Anas Jamshed  wrote:
>
> I have a list of 422 genes in an Excel file. I want to know if it's possible to
> get SNP details for these genes by using the Ensembl REST API or biomaRt?
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Trying to understand the magic of lm

2019-05-09 Thread William Michels via R-help
Hello John,

Others have commented on the first half of your question, but the
second half of your question looks very much like R's built-in
predict() functions:

>?predict
>?predict.lm
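A minimal sketch with your own example data, assuming the aim is to fit on
one data.frame and then predict on another that uses the same column names:

mydata  <- data.frame(dep = c(1, 2, 3, 4, 5), ind = c(1, 2, 4, 5, 7))
fit0    <- lm(dep ~ ind, data = mydata)
newdata <- data.frame(ind = c(3, 6, 8))
predict(fit0, newdata = newdata)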

Best Regards,

Bill.

W. Michels, Ph.D.



On Wed, May 8, 2019 at 6:23 PM Sorkin, John  wrote:
>
> Can someone send me something I can read about passing parameters so I can 
> understand how lm manages to have a dataframe passed to it, and use columns 
> from the dataframe to set up a regression. I have looked at the code for lm 
> and don't understand what I am reading. What I want to do is something like 
> the following,
>
>
> myfunction <- function(y,x,dataframe){
>
>   fit0 <- lm(y~x,data=dataframe)
>   print (summary(fit0))
> }
>
> # Run the function using dep and ind as dependent and independent variables.
> mydata <- data.frame(dep=c(1,2,3,4,5),ind=c(1,2,4,5,7))
> myfunction(dep,ind)
> # Run the function using outcome and predictor as dependent and independent 
> variables.
> newdata <- data.frame(outcome=c(1,2,3,4,5),predictor=c(1,2,4,5,7))
> myfunction(outcome,predictor)
>
>
>
>
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and 
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with R coding

2019-05-22 Thread William Michels via R-help
Morning Bill, I take it this is dplyr? You might try:

tmp1 <- HCPC %>%
group_by(HCPCSCode) %>%
summarise(Avg_AllowByLimit =
mean(Avg_AllowByLimit[which(Avg_AllowByLimit!=0 & AllowByLimitFlag ==
TRUE)]))

The code above gives "NaN" for cases where AllowByLimitFlag == FALSE.
Maybe this is the answer you desire, otherwise you can filter out
"NaN" rows.

Cheers,

Bill.

W. Michels, Ph.D.


On Wed, May 22, 2019 at 5:41 AM Bill Poling  wrote:
>
> Good morning.
>
> #R version 3.6.0 Patched (2019-05-19 r76539)
> #Platform: x86_64-w64-mingw32/x64 (64-bit)
> #Running under: Windows >= 8 x64 (build 9200)
>
> I need a calculated field  For the Rate of Avg_AllowByLimit where the 
> Allowed_AmtFlag = TRUE BY Each Code
>
> I have almost got this.
>
> #So far I have this
> tmp1 <- tmp %>%
> group_by(HCPCSCode) %>%
> summarise(Avg_AllowByLimit = 
> mean(Avg_AllowByLimit[which(Avg_AllowByLimit!=0)]))
>
> # But I need Something like that + This
>
> WHERE AllowByLimitFlag == TRUE
>
> I cannot seem to get it in there correctly
>
> Thank you for any help
>
> WHP
>
> #Here is some data
>    HCPCSCode Avg_AllowByLimit AllowByLimitFlag
> 1      J1745             4.50            FALSE
> 2      J9299            18.70            FALSE
> 3      J9306            14.33            FALSE
> 4      J9355             7.13            FALSE
> 5      J0897             8.61            FALSE
> 6      J9034             3.32            FALSE
> 7      J9034             3.32            FALSE
> 8      J9045            15.60            FALSE
> 9      J9035             2.77             TRUE
> 10     J1190             3.62            FALSE
> 11     J2250           879.10            FALSE
> 12     J9033             2.92            FALSE
> 13     J1745             4.50             TRUE
> 14     J2785            12.11            FALSE
> 15     J9045            15.60            FALSE
> 16     J2350             7.81            FALSE
> 17     J2469            10.65             TRUE
> 18     J2796             6.27            FALSE
> 19     J2796             6.27            FALSE
> 20     J9355             7.13            FALSE
> 21     J9045            15.60            FALSE
> 22     J2505             2.73            FALSE
> 23     J1786             2.81            FALSE
> 24     J3262             3.26            FALSE
> 25     J0696           168.87            FALSE
> 26     J0178             1.52             TRUE
> 27     J9271             5.55            FALSE
> 28     J3380            80.99            FALSE
> 29     J9355             7.13             TRUE
> 30     J2469            10.65            FALSE
> 31     J9045            15.60            FALSE
> 32     J1459             3.64            FALSE
> 33     J9305             8.74            FALSE
> 34     J9034             3.32            FALSE
> 35     J9034             3.32            FALSE
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: new_index

2019-09-07 Thread William Michels via R-help
Hi Val, see below:

> dat1 <-read.table(text="ID, x, y, z
+ A, 10,  34, 12
+ B, 25,  42, 18
+ C, 14,  20,  8 ",sep=",",header=TRUE,stringsAsFactors=F)
>
> dat2 <-read.table(text="ID, weight
+ A,  0.25
+ B,  0.42
+ C,  0.65 ",sep=",",header=TRUE,stringsAsFactors=F)
>
> dat3 <- data.frame(ID = dat1[,1], Index = apply(dat1[,-1], 1, FUN= 
> function(x) {sum(x*dat2[,2])} ), stringsAsFactors=F)
> dat3
  ID Index
1  A 24.58
2  B 35.59
3  C 17.10
>
> str(dat3)
'data.frame': 3 obs. of  2 variables:
 $ ID   : chr  "A" "B" "C"
 $ Index: num  24.6 35.6 17.1
>

The first two results "A" and "B" are identical to your example, but
your math in "C" appears a little off.

HTH, Bill.

W. Michels, Ph.D.

On Sat, Sep 7, 2019 at 11:47 AM Val  wrote:
>
> Hi All,
>
> I have two data frames   with thousand  rows  and several columns. My
> samples of the data frames are shown below
>
> dat1 <-read.table(text="ID, x, y, z
> ID , x, y, z
> A, 10,  34, 12
> B, 25,  42, 18
> C, 14,  20,  8 ",sep=",",header=TRUE,stringsAsFactors=F)
>
> dat2 <-read.table(text="ID, x, y, z
> ID, weight
> A,  0.25
> B,  0.42
> C,  0.65 ",sep=",",header=TRUE,stringsAsFactors=F)
>
> My goal is to create an index value for each ID by multiplying each row of
> dat1 by the weight column (second column) of dat2.
>
>   (10*0.25 ) + (34*0.42) + (12*0.65)=  24.58
>   (25*0.25 ) + (42*0.42) + (18*0.65)=  35.59
>   (14*0.25 ) + (20*0.42) + (  8*0.65)=  19.03
>
> The  desired out put is
> dat3
> ID, Index
> A 24.58
> B  35.59
> C  19.03
>
> How do I do it in an efficient way?
>
> Thank you,
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] static vs. lexical scope

2019-09-26 Thread William Michels via R-help
The best summary I've read on the subject of R's scoping rules (in
particular how they compare to scoping rules in S-PLUS) is Dr. John
Fox's "Frames, Environments, and Scope in R and S-PLUS", written as an
Appendix to the first edition of his book, An R and S-PLUS Companion
to Applied Regression (2002).

In this document Dr. Fox refers to "lexical" scoping primarily,
however where "static" scoping is mentioned, it is defined as
equivalent to "lexical" scoping. The Appendix is available as a PDF
from:

https://socialsciences.mcmaster.ca/jfox/Books/Companion-1E/appendix-scope.pdf

HTH, Bill.

W. Michels, Ph.D.



On Thu, Sep 26, 2019 at 7:59 AM Duncan Murdoch  wrote:
>
> On 26/09/2019 9:44 a.m., Richard O'Keefe wrote:
> > Actually, R's scope rules are seriously weird.
> > I set out to write an R compiler, wow, >20 years ago.
> > Figured out how to handle optional and keyword parameters efficiently,
> > figured out a lot of other things, but choked on the scope rules.
> > Consider
> >
> >> x <- 1
> >> f <- function () {
> > +   a <- x
> > +   x <- 2
> > +   b <- x
> > +   c(a=a, b=b)
> > + }
> >> f()
> > a b
> > 1 2
> >> x
> > [1] 1
> >
> > It's really not clear what is going on here.
>
> This is all pretty clear:  in the first assignment, x is found in the
> global environment, because it does not exist in the evaluation frame.
> In the second assignment, a new variable is created in the evaluation
> frame.  In the third assignment, that new variable is used to set the
> value of b.
>
> > However, ?assign can introduce new variables into an environment,
> > and from something like
> >with(df, x*2-y)
> > it is impossible for a compiler to tell which, if either, of x and y is to
> > be obtained from df and which from outside.  And of course ?with
> > is just a function:
> >
> >> df <- data.frame(y=24)
> >> w <- with
> >> w(df, x*2-y)
> > [1] -22
> >
> > So you cannot in general tell *which* function can twist the environment
> > in which its arguments will be evaluated.
>
> It's definitely hard to compile R because of the scoping rules, but that
> doesn't make the scoping rules unclear.
>
> > I got very tired of trying to explore a twisty maze of documentation and
> > trying to infer a specification from examples.  I would come up with an
> > ingenious mechanism for making the common case tolerable and the
> > rare cases possible, and then I'd discover a bear trap I hadn't seen.
> > I love R, but I try really hard not to be clever with it.
>
> I think the specification is really pretty simple.  I'm not sure it is
> well documented anywhere, but I think I understand it pretty well, and
> it doesn't seem overly complicated to me.
>
> > So while R's scoping is *like* lexical scoping, it is *dynamic* lexical
> > scoping, to coin a phrase.
>
> I'd say it is regular lexical scoping but with dynamic variable
> creation. Call that dynamic lexical scoping if you want, but it's not
> really a mystery.
>
> Duncan Murdoch
>
> >
> > On Thu, 26 Sep 2019 at 23:56, Martin Møller Skarbiniks Pedersen
> >  wrote:
> >>
> >> On Wed, 25 Sep 2019 at 11:03, Francesco Ariis  wrote:
> >>>
> >>> Dear R users/developers,
> >>> while ploughing through "An Introduction to R" [1], I found the
> >>> expression "static scope" (in contraposition to "lexical scope").
> >>>
> >>> I was a bit puzzled by the difference (since e.g. Wikipedia conflates the
> >>> two) until I found this document [2].
> >>
> >>
> >> I sometimes teach a little R, and they might ask about static/lexical 
> >> scope.
> >> My short answer is normally that S uses static scoping and R uses
> >> lexical scoping.
> >> And most all modern languages uses lexical scoping.
> >> So if they know Java, C, C# etc. then the scoping rules for R are the same.
> >>
> >> I finally says that it is not a full answer but enough for most.
> >>
> >> Regards
> >> Martin
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] The "--slave" option ==> will become "--no-echo"

2019-09-27 Thread William Michels via R-help
Hi Martin,

'--no-echo'

or

'--no_echo'

Obviously you may prefer the first, but I hope you might consider the second.

Best Regards,

W. Michels, Ph.D.


On Fri, Sep 27, 2019 at 9:04 AM Martin Maechler
 wrote:
>
> > Martin Maechler
> > on Mon, 23 Sep 2019 16:14:36 +0200 writes:
>
> > Richard O'Keefe
> > on Sat, 21 Sep 2019 09:39:18 +1200 writes:
>
> >> Ah, *now* we're getting somewhere.  There is something
> >> that *can* be done that's genuinely helpful.
> >>> From the R(1) manual page:
> >> -q, --quiet Don't print startup message
>
> >> --silent Same as --quiet
>
> >> --slave Make R run as quietly as possible
>
> >> It might have been better to use --nobanner instead of
> >> --quiet.  So perhaps
>
> >> -q, --quiet Don't print the startup message.  This is
> >> the only output that is suppressed.
>
> >> --silent Same as --quiet.  Suppress the startup
> >> message only.
>
> >> --slave Make R run as quietly as possible.  This is
> >> for use when running R as a subordinate process.  See
> >> "Introduction to Sub-Processes in R"
> >> https://cran.r-project.org/web/packages/subprocess/vignettes/intro.html
> >> for an example.
>
> > Thank you, Stephen and Richard.
>
> > I think we (the R Core Team) *can* make the description a bit
> > more verbose. However, as practically all "--" descriptions
> > are fitting in one short line, (and as the 'subprocess' package is just 
> an
> > extension pkg, and may disappear (and more reasons)) I'd like to
> > be less verbose than your proposal.
>
> > What about
>
> > -q, --quiet   Don't print startup message
>
> > --silent  Same as --quiet
>
> > --slave   Make R run as quietly as possible.  For use when
> > running R as a sub(ordinate) process.
>
> > If you look more closely, you'll notice that --slave is not much
> > quieter than --quiet, the only (?) difference being that the
> > input is not copied and (only "mostly") the R prompt is also not 
> printed.
>
> > And from my experiments (in Linux (Fedora 30)), one might even
> > notice that in some cases --slave prints the R prompt (to stderr?)
> > which one might consider bogous (I'm not: not wanting to spend
> > time fixing this platform-independently) :
>
> > --slave :
> > 
>
> > MM@lynne$ echo '(i <- 1:3)
> > i*10' | R-3.6.1 --slave --vanilla
> >> [1] 1 2 3
> > [1] 10 20 30
> > MM@lynne$ f=/tmp/Rslave.out$$; echo '(i <- 1:3)
> > i*10' | R-3.6.1 --slave --vanilla | tee $f
> >> [1] 1 2 3
> > [1] 10 20 30
> > MM@lynne$ cat $f
> > [1] 1 2 3
> > [1] 10 20 30
>
> > --quiet :
> > 
>
> > MM@lynne$ f=/tmp/Rquiet.out$$; echo '(i <- 1:3)
> > i*10' | R-3.6.1 --quiet --vanilla | tee $f
> >> (i <- 1:3)
> > [1] 1 2 3
> >> i*10
> > [1] 10 20 30
> >>
> > MM@lynne$ cat $f
> >> (i <- 1:3)
> > [1] 1 2 3
> >> i*10
> > [1] 10 20 30
> >>
> > MM@lynne$
>
> > 
>
> > But there's a bit more to it: In my examples above, both --quiet
> > and --slave where used together with --vanilla.  In general
> > --slave *also* never saves, i.e., uses the equivalent of
> > q('no'), where as --quiet does [ask or ...].
>
> > Last but not least, from very simply reading R's source code on
> > this, it becomes blatant that you can use  '-s'  instead of '--slave',
> > but we (R Core) have probably not documented that on purpose (so
> > we could reserve it for something more important, and redefine
> > the simple use of '-s' some time in the future ?)
>
> > So, all those who want to restrict their language could use '-s'
> > for now.  In addition, we could add  >> one <<  other alias to
> > --slave, say --subprocess (or --quieter ? or ???)
> > and one could make that the preferred use some time in the future.
>
> > Well, these were another two hours of time *not* spent improving
> > R technically, but spent reading e-mails, source code, and considering.
> > Maybe well spent, maybe not ...
>
> > Martin Maechler
> > ETH Zurich and R Core Team
>
> With in the   R Core Teamwe have considered the issue.
>
> As a consequence, I have committed a few minutes ago code changes
> that replace '--slave' by '--no-echo' .
> [This will be in R-devel versions from svn rev 77229 and of
>  course in the "big step" release around April 2020].
>
> Among other considerations, we found that  '--no-echo' was
> really much more self-explaining, as indeed the command line
> option turns off the echo'ing of the R code that is executed,
> and on the C level is indeed very much related to R level
>
> options(echo = "no")
>
> For back compatibility reasons, the old command line option will
continue to work so the many shell and other scripts that already use it
will not break.

Re: [R] The "--slave" option ==> will become "--no-echo"

2019-09-27 Thread William Michels via R-help
Apologies, Duncan and Martin. I didn't check "R --help" first. You're
quite right, lots of embedded hyphens.

Best Regards, Bill.

W. Michels, Ph.D.

On Fri, Sep 27, 2019 at 2:42 PM Duncan Murdoch  wrote:
>
> On 27/09/2019 5:36 p.m., William Michels via R-help wrote:
> > Hi Martin,
> >
> > '--no-echo'
> >
> > or
> >
> > '--no_echo'
> >
> > Obviously you may prefer the first, but I hope you might consider the 
> > second.
>
> Are you serious?  That's a terrible suggestion.  Run "R --help" and
> you'll see *no* options with underscores, and a dozen with embedded hyphens.
>
> Duncan Murdoch
>
> >
> > Best Regards,
> >
> > W. Michels, Ph.D.
> >
> >
> > On Fri, Sep 27, 2019 at 9:04 AM Martin Maechler
> >  wrote:
> >>
> >>>>>>> Martin Maechler
> >>>>>>>  on Mon, 23 Sep 2019 16:14:36 +0200 writes:
> >>
> >>>>>>> Richard O'Keefe
> >>>>>>>  on Sat, 21 Sep 2019 09:39:18 +1200 writes:
> >>
> >>  >> Ah, *now* we're getting somewhere.  There is something
> >>  >> that *can* be done that's genuinely helpful.
> >>  >>> From the R(1) manual page:
> >>  >> -q, --quiet Don't print startup message
> >>
> >>  >> --silent Same as --quiet
> >>
> >>  >> --slave Make R run as quietly as possible
> >>
> >>  >> It might have been better to use --nobanner instead of
> >>  >> --quiet.  So perhaps
> >>
> >>  >> -q, --quiet Don't print the startup message.  This is
> >>  >> the only output that is suppressed.
> >>
> >>  >> --silent Same as --quiet.  Suppress the startup
> >>  >> message only.
> >>
> >>  >> --slave Make R run as quietly as possible.  This is
> >>  >> for use when running R as a subordinate process.  See
> >>  >> "Introduction to Sub-Processes in R"
> >>  >> 
> >> https://cran.r-project.org/web/packages/subprocess/vignettes/intro.html
> >>  >> for an example.
> >>
> >>  > Thank you, Stephen and Richard.
> >>
> >>  > I think we (the R Core Team) *can* make the description a bit
> >>  > more verbose. However, as practically all "--" descriptions
> >>  > are fitting in one short line, (and as the 'subprocess' package is 
> >> just an
> >>  > extension pkg, and may disappear (and more reasons)) I'd like to
> >>  > be less verbose than your proposal.
> >>
> >>  > What about
> >>
> >>  > -q, --quiet   Don't print startup message
> >>
> >>  > --silent  Same as --quiet
> >>
> >>  > --slave   Make R run as quietly as possible.  For use when
> >  > running R as a sub(ordinate) process.
> >>
> >>  > If you look more closely, you'll notice that --slave is not much
> >>  > quieter than --quiet, the only (?) difference being that the
> >>  > input is not copied and (only "mostly") the R prompt is also not 
> >> printed.
> >>
> >>  > And from my experiments (in Linux (Fedora 30)), one might even
> >>  > notice that in some cases --slave prints the R prompt (to stderr?)
> >>  > which one might consider bogous (I'm not: not wanting to spend
> >>  > time fixing this platform-independently) :
> >>
> >>  > --slave :
> >>  > 
> >>
> >>  > MM@lynne$ echo '(i <- 1:3)
> >>  > i*10' | R-3.6.1 --slave --vanilla
> >>  >> [1] 1 2 3
> >>  > [1] 10 20 30
> >>  > MM@lynne$ f=/tmp/Rslave.out$$; echo '(i <- 1:3)
> >>  > i*10' | R-3.6.1 --slave --vanilla | tee $f
> >>  >> [1] 1 2 3
> >>  > [1] 10 20 30
> >>  > MM@lynne$ cat $f
> >>  > [1] 1 2 3
> >>  > [1] 10 20 30
> >>
> >>  > --quiet :
> >>  > 
> >>
> >>  > MM@lynne$ f=/tmp/Rquiet.out$$; echo '(i <- 1:3)
> >>

Re: [R] can not extract rows which match a string

2019-10-03 Thread William Michels via R-help
Hello,

I expected the code you posted to work just as you presumed it would,
but without a reproducible example--I can only speculate as to why it
didn't.

In the t1 dataframe, if indeed you only want to remove rows of the
t1$sex_chromosome_aneuploidy_f22019_0_0 column which are undefined,
you could try the following:

> t11 <- t1[ !is.na(t1$sex_chromosome_aneuploidy_f22019_0_0), ]

HTH, Bill.

W. Michels, Ph.D.



On Thu, Oct 3, 2019 at 11:59 AM Ana Marija  wrote:
>
> Hello,
>
> I have a dataframe (t1) with many columns, but the one I care about it this:
> > unique(t1$sex_chromosome_aneuploidy_f22019_0_0)
> [1] NA"Yes"
>
> it has these two values.
>
> I would like to remove from my dataframe t1 all rows which have "Yes"
> in t1$sex_chromosome_aneuploidy_f22019_0_0
>
> I tried selecting those rows with "Yes" via:
>
> t11=t1[t1$sex_chromosome_aneuploidy_f22019_0_0=="Yes",]
>
> but I got t11 which has the exact same number of rows as t1.
>
> If I do:
> > table(t1$sex_chromosome_aneuploidy_f22019_0_0)
>
> Yes
> 620
>
> So there is for sure 620 rows which have "Yes". How to remove those
> from my t1 data frame?
>
> Thanks
> Ana
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can not extract rows which match a string

2019-10-04 Thread William Michels via R-help
Apologies Ana, Of course Rui and Herve (and Richard) are correct here
in stating that NA values get 'carried through' when selecting using
the "==" operator.

To give an illustration of what (I believe) Herve means by "NAs
propagating", here's a small 11 x 8 dataframe ("zakaria") posted to
R-Help last year, which fortuitously has one column ("PO2T")
containing only the numeric value 50 as well as NAs. I compare
selecting with the "%in%" operator (as Herve suggests) and selecting
with the "==" operator. Notice the "propagating NAs" (last line of
code):

https://stat.ethz.ch/pipermail/r-help/2018-October/456798.html

> dim(zakaria)
[1] 11  8
> zakaria
   STUDENT_ID COURSE_CODE   PO1M PO1T PO2M PO2T  X X.1
1 AA15285 BAA1113 155.70  180   NA   NA NA  NA
2 AA15285 BAA1322  48.90   70   NA   NA NA  NA
3 AA15285 BAA2713  83.20  100   NA   NA NA  NA
4 AA15285 BAA2921 NA   NA   37   50 NA  NA
5 AA15285 BAA4273 NA   NA   NA   NA NA  NA
6 AA15285 BAA4513 NA   NA   NA   NA NA  NA
7 AA15286 BAA1322  48.05   70   NA   NA NA  NA
8 AA15286 BAA2113  68.40  100   NA   NA NA  NA
9 AA15286 BAA2513  41.65   60   NA   NA NA  NA
10AA15286 BAA2713  82.35  100   NA   NA NA  NA
11AA15286 BAA2921 NA   NA   41   50 NA  NA
> unique(zakaria$PO2T)
[1] NA 50
> table(zakaria$PO2T, exclude=NULL)

  50 <NA> 
   2    9 
> zakaria[!is.na(zakaria$PO2T), ]
   STUDENT_ID COURSE_CODE PO1M PO1T PO2M PO2T  X X.1
4     AA15285     BAA2921   NA   NA   37   50 NA  NA
11    AA15286     BAA2921   NA   NA   41   50 NA  NA
> zakaria[zakaria$PO2T %in% 50, ]
   STUDENT_ID COURSE_CODE PO1M PO1T PO2M PO2T  X X.1
4     AA15285     BAA2921   NA   NA   37   50 NA  NA
11    AA15286     BAA2921   NA   NA   41   50 NA  NA
> zakaria[zakaria$PO2T==50, ]
     STUDENT_ID COURSE_CODE PO1M PO1T PO2M PO2T  X X.1
NA         <NA>        <NA>   NA   NA   NA   NA NA  NA
NA.1       <NA>        <NA>   NA   NA   NA   NA NA  NA
NA.2       <NA>        <NA>   NA   NA   NA   NA NA  NA
4       AA15285     BAA2921   NA   NA   37   50 NA  NA
NA.3       <NA>        <NA>   NA   NA   NA   NA NA  NA
NA.4       <NA>        <NA>   NA   NA   NA   NA NA  NA
NA.5       <NA>        <NA>   NA   NA   NA   NA NA  NA
NA.6       <NA>        <NA>   NA   NA   NA   NA NA  NA
NA.7       <NA>        <NA>   NA   NA   NA   NA NA  NA
NA.8       <NA>        <NA>   NA   NA   NA   NA NA  NA
11      AA15286     BAA2921   NA   NA   41   50 NA  NA
>

I am certainly taking Herve's advice seriously, but I also believe
that when importing data into R, carefully setting parameters such as
the "na.strings" parameter of read.table() can help you avoid
surprises later on.
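A minimal sketch (the file name and separator are placeholders for your own
import call):

t1 <- read.table("yourfile.txt", header = TRUE, sep = "\t",
                 na.strings = c("NA", "", "."), stringsAsFactors = FALSE)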

HTH, Bill.

W. Michels, Ph.D.

On Thu, Oct 3, 2019 at 1:34 PM Rui Barradas  wrote:

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change colour ggiNEXT plot package iNEXT

2019-10-23 Thread William Michels via R-help
Apparently, the iNEXT package was first described in an academic paper
published in 2016, although CRAN archives go back to 2015.
http://chao.stat.nthu.edu.tw/wordpress/paper/120_pdf_appendix.pdf
https://cran.r-project.org/src/contrib/Archive/iNEXT/

The vignette below has a section entitled "General Customization"
which talks about color. See the four lines of code I've added to the
vignette's code to get a general idea what to do.
https://cran.r-project.org/web/packages/iNEXT/vignettes/Introduction.html

library(iNEXT)
library(ggplot2)
library(gridExtra)
library(grid)
data("spider")
out <- iNEXT(spider, q=0, datatype="abundance")
g <- ggiNEXT(out, type=1, color.var = "site")
print(g)
g1 <- g + scale_colour_manual(values=c("yellow", "green"))
print(g1)
g2 <- g1 + scale_fill_manual(values=c("yellow", "green"))
print(g2)

HTH, Bill.

W. Michels, Ph.D.




On Wed, Oct 23, 2019 at 11:13 AM David Winsemius  wrote:
>
>
> On 10/22/19 12:48 PM, Luigi Marongiu wrote:
> > I thought it was a major package for ecological analysis.
>
>
> Yours is the first question in 20 years of Rhelp about the package iNEXT.
>
>
> --
>
> David
>
> > Anyway,
> > thank you for the tips. I'll dip from there.
> >
> > On Tue, Oct 22, 2019 at 5:29 PM Jeff Newmiller  
> > wrote:
> >> Probably, assuming that function returns a ggplot object. You will need to 
> >> identify the levels of the factor used for distinguishing groups, and add 
> >> a scale_colour_manual() to the ggplot object with colors specified in the 
> >> same order as those levels.
> >>
> >> Support for obscure packages is technically off-topic here ... if you need 
> >> a more specific answer you may need to correspond with the package authors 
> >> or use their suggested support resources.
> >>
> >> On October 22, 2019 2:18:49 AM PDT, Luigi Marongiu 
> >>  wrote:
> >>> Dear all,
> >>> is it possible to provide custom color to the rarefaction curve of the
> >>> package iNEXT (ggiNEXT)?
> >>> If I have these data:
> >>> ```
> >>> library(iNEXT)
> >>> library(ggplot2)
> >>> data(spider)
> >>> out <- iNEXT(spider, q=0, datatype="abundance")
> >>> ggiNEXT(out, type=1)
> >>> ```
> >>> can i colour the lines with, let's say, yellow and green?
> >>> Thank you
> >> --
> >> Sent from my phone. Please excuse my brevity.
> >
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] If Loop I Think

2019-10-24 Thread William Michels via R-help
Hi Phillip,

Jim and David and Petr all wrote you good code, but you have major
problems in data formatting. Your data uses spaces both as a column
separator and also to denote "blank fields". Because of problems with
your input data structure, it's doubtful whether the good code you've
received will result in the correct baseball answer.

The Arizona Diamondbacks data you posted shows runner positions for
about seven outs of a game (about 1-and-1/6 inning)--I say "about"
because there may be subsequent rows with the same number of outs
listed in row 14. However, rows 10/11 have two blank spaces between
the number-of-outs and a runner_ID (suggesting one "blank field" to
the left of the first runner_ID), while row 12 has three blank spaces
between the number-of-outs and the first runner_ID (suggesting two
"blank fields" to the left of the first runner_ID).

Since bases are loaded in row 9 and no outs are recorded between rows
9 and 10, the game situation suggests that two runners score between
rows 9 and 10 (polla001 and perad001), with the remaining baserunners
ending up on second and third base, not first and second base (best
guess: batter lambj001 hits a double, winds up on second base, and
gets two RBIs). Similarly between rows 11 and 12, goldp001 is removed
as a baserunner and an out is recorded, however no new baserunners
appear. This game situation suggests both runners advancing (e.g. by a
sacrifice fly) with goldp001 scoring and the remaining baserunner
(lambj001) ending up on third base, not second base or first base.

Now if you run the code posted earlier using read.table(), in all
cases you will find blank fields removed between the "outs" column and
the first baserunner listed, so every row of your data with
runners-on-base will have a runner on first-base. Intuitively, you
know this must be wrong (think doubles and triples). The mechanics of
read.table() are such that the field separator character ("sep"
parameter) defaults to 'white space', that is to say, "ONE OR MORE
spaces, tabs, newlines or carriage returns" (capitalization mine). So
multiple white space characters in your file are read as a single
"field separator" separating two adjacent columns.

What you really need to do is export your data in a format that R can
easily understand. There's a possibility that posting your code in
HTML to the R-Help mailing list may have corrupted your data (e.g.
removing tabs and inserting spaces instead), but no matter. You need
to set up a workflow so this **cannot** happen, i.e. start exporting
from a spreadsheet program in ".csv" format and start importing into R
using R's read.csv() function instead. Colleagues have recommended the
book "Beyond Spreadsheets with R" by Dr. Jonathan Carroll to me as a
good introductory text for tackling these issues.
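
As a rough sketch of that workflow (the file name here is hypothetical,
and the na.strings value is just a guess at how empty cells will appear
in your export):

ari18 <- read.csv("ari18_export.csv", header=TRUE,
                  na.strings="", stringsAsFactors=FALSE)
str(ari18)     # confirm that blank cells really came in as NA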

Finally (if you've read this far), the truth is that if you work at it a
little bit, you can get the data you posted into R into a reasonable
format using lists (although starting from a ".csv" file may be
conceptually easier for you). Lists are very useful when you have
multiple vectors of different lengths. See the code below (note--I
dropped your first "Row#" column):

> zz <- textConnection("ari18.test3.raw", "w")
> writeLines(con=zz, c("0
+ 1
+ 1
+ 1 arenn001
+ 2 arenn001
+ 0
+ 0 perad001
+ 0 polla001 perad001
+ 0 goldp001 polla001 perad001
+ 0  lambj001 goldp001
+ 1  lambj001 goldp001
+ 2   lambj001
+ 0
+ 1   "))
> close(zz)
> ari18.test3.raw
 [1] "0   " "1   "
 [3] "1   " "1 arenn001  "
 [5] "2 arenn001  " "0   "
 [7] "0 perad001  " "0 polla001 perad001 "
 [9] "0 goldp001 polla001 perad001" "0  lambj001 goldp001"
[11] "1  lambj001 goldp001" "2   lambj001"
[13] "0   " "1   "
> aa <- strsplit(trimws(ari18.test3.raw), split=" ")
> bb <- t(sapply(aa, FUN=function(x) {c(x, rep(NA, length.out=4-length(x)))} ))
> cc <- t(apply(bb[,-1], 1, FUN=function(x) {ifelse(test=nchar(x), yes=1, 
> no=0)} ))
> bb
  [,1] [,2]   [,3]   [,4]
 [1,] "0"  NA NA NA
 [2,] "1"  NA NA NA
 [3,] "1"  NA NA NA
 [4,] "1"  "arenn001" NA NA
 [5,] "2"  "arenn001" NA NA
 [6,] "0"  NA NA NA
 [7,] "0"  "perad001" NA NA
 [8,] "0"  "polla001" "perad001" NA
 [9,] "0"  "goldp001" "polla001" "perad001"
[10,] "0"  "" "lambj001" "goldp001"
[11,] "1"  "" "lambj001" "goldp001"
[12,] "2"  "" "" "lambj001"
[13,] "0"  NA NA NA
[14,] "1"  NA NA NA
> cc
      [,1] [,2] [,3]
 [1,]   NA   NA   NA
 [2,]   NA   NA   NA
 [3,]   NA   NA   NA
 [4,]    1   NA   NA
 [5,]    1   NA   NA
 [6,]   NA   NA   NA
 [7,]    1   NA   NA
 [8,]    1    1   NA
 [9,]    1    1    1
[10,]    0    1    1
[11,]    0    1    1
[12,]    0    0    1
[13,]   NA   NA   NA
[14,]   NA   NA   NA
>

HTH,

Re: [R] If Loop I Think

2019-10-27 Thread William Michels via R-help
Hi Phillip,

I wanted to follow up with you regarding your earlier post. Below is a
different way to work up your data than I posted earlier.

I took the baseball data you posted, stripped out
leading-and-following blank lines, removed all trailing spaces on each
line, and removed the "R1", "R2" and "R3" column names, since they're
blank columns anyway. I then read this text file ("diamond2.txt") into
R using the read.table() call below. Note the use of the sep=" "
parameter--it is very important to include this parameter when
analyzing your dataset in R, as it is not the default setting. I was
then able to generate the "R1", "R2", "R3" columns you sought, using
apply() with anonymous functions:

> testAD <- read.table("diamond2.txt", header=T, sep=" ", na.strings="", 
> fill=T, row.names=NULL, stringsAsFactors=F)
> testAD$R1=rep(NA, 14)
> testAD$R2=rep(NA, 14)
> testAD$R3=rep(NA, 14)
> testAD[ ,c(6:8)] <- apply(testAD[ ,c(3:5)], 2, FUN=function(x) 
> {ifelse(test=nchar(x), yes=1, no=0)} )
> testAD[ ,c(6:8)] <- apply(testAD[ ,c(6:8)], 2, FUN=function(x) 
> {ifelse(test=!is.na(x), yes=x, no=0)} )
> testAD
   Row Outs RunnerFirst RunnerSecond RunnerThird R1 R2 R3
1    1    0        <NA>         <NA>        <NA>  0  0  0
2    2    1        <NA>         <NA>        <NA>  0  0  0
3    3    1        <NA>         <NA>        <NA>  0  0  0
4    4    1    arenn001         <NA>        <NA>  1  0  0
5    5    2    arenn001         <NA>        <NA>  1  0  0
6    6    0        <NA>         <NA>        <NA>  0  0  0
7    7    0    perad001         <NA>        <NA>  1  0  0
8    8    0    polla001     perad001        <NA>  1  1  0
9    9    0    goldp001     polla001    perad001  1  1  1
10  10    0        <NA>     lambj001    goldp001  0  1  1
11  11    1        <NA>     lambj001    goldp001  0  1  1
12  12    2        <NA>         <NA>    lambj001  0  0  1
13  13    0        <NA>         <NA>        <NA>  0  0  0
14  14    1        <NA>         <NA>        <NA>  0  0  0
>

HTH,

Bill.

W. Michels, Ph.D.


On Thu, Oct 24, 2019 at 12:44 PM William Michels  wrote:
>
> Hi Phillip,

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Conditions

2019-11-26 Thread William Michels via R-help
Hi Val,

Here's an answer using a series of ifelse() statements. Because the d4
column is created initially using NA as a placeholder, you can check
your conditional logic at the end using table(!is.na(dat2$d4)):

> dat2 <-read.table(text="ID  d1 d2 d3
+ A 0 25 35
+ B 12 22  0
+ C 0  0  31
+ E 10 20 30
+ F 0  0   0", header=TRUE, stringsAsFactors=F)
>
> dat2$d4 <- NA
> dat2$d4 <- with(dat2, ifelse(d1!=0, yes=d1, no=d4))
> dat2$d4 <- with(dat2, ifelse((d1==0 & d2!=0), yes=d2, no=d4))
> dat2$d4 <- with(dat2, ifelse((d1==0 & d2==0 & d3!=0), yes=d3, no=d4))
> dat2$d4 <- with(dat2, ifelse((d1==0 & d2==0 & d3==0), yes=0, no=d4))
>
> dat2
  ID d1 d2 d3 d4
1  A  0 25 35 25
2  B 12 22  0 12
3  C  0  0 31 31
4  E 10 20 30 10
5  F  0  0  0  0
>
> table(!is.na(dat2$d4))

TRUE
   5
>

Your particular conditionals don't appear sensitive to order, but
someone else using the same strategy may have to take care to run the
ifelse() statements in the correct (desired) order.
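
Just as a sketch, a nested form should give the same result on your data
and makes the precedence explicit (the first test that is TRUE wins):

dat2$d4 <- with(dat2, ifelse(d1 != 0, d1,
                      ifelse(d2 != 0, d2,
                      ifelse(d3 != 0, d3, 0))))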

HTH, Bill.

W. Michels, Ph.D.



On Tue, Nov 26, 2019 at 3:15 PM Val  wrote:
>
> HI All, I am having a little issue in my ifelse statement,
> The data frame looks like as follow.
>
> dat2 <-read.table(text="ID  d1 d2 d3
> A 0 25 35
> B 12 22  0
> C 0  0  31
> E 10 20 30
> F 0  0   0",header=TRUE,stringsAsFactors=F)
> I want to create d4 and set the value based on the following conditions.
> If d1  !=0  then d4=d1
> if d1 = 0  and d2 !=0  then d4=d2
> if (d1  and d2 = 0) and d3  !=0 then d4=d3
> if all d1, d2 and d3 =0 then d4=0
>
> Here is the desired output and my attempt
>  ID d1 d2 d3 d4
>   A  0 25 35  25
>   B 12 22  0  12
>   C  0  0 31   31
>   E 10 20 30  10
>   F  0  0  0  0  0
>
> My attempt
> dat2$d4 <-  0
> dat2$d4  <- ifelse((dat2$d1 =="0"), dat2$d2, ifelse(dat2$d2 == "0"), dat2$d3, 
> 0)
> but not working.
>
> Thank you.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to use preProcess in Caret?

2019-12-04 Thread William Michels via R-help
Hello,

Have you tried alternative methods of pre-processing your data, such
as simply calling scale()? What is the effect on convergence, for both
the caret package and and the neuralnet package? There's an example
using scale() with the neuralnet package at the link below:

https://datascienceplus.com/fitting-neural-network-in-r/
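
A minimal sketch of the idea on toy data (untested against your gist;
the object and column names below are made up):

dat <- data.frame(x = rep(1:12, 12), y = rep(1:12, each = 12))
dat$z <- dat$x * dat$y                  # "times table" response
sc   <- scale(dat)                      # centre and scale every column
ctr  <- attr(sc, "scaled:center")
sdv  <- attr(sc, "scaled:scale")
scdat <- as.data.frame(sc)
## ...fit neuralnet() or caret::train() on scdat here...
## back-transform a scaled prediction p of z with:  p * sdv["z"] + ctr["z"]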

HTH, Bill.

W. Michels, Ph.D.



On Sun, Dec 1, 2019 at 10:04 AM Burak Kaymakci  wrote:
>
> Hello there,
>
> I am using caret and neuralnet to train a neural network to predict times
> table. I am using 'backprop' algorithm for neuralnet to experiment and
> learn.
>
> Before using caret, I've trained a neuralnet without using caret, I've
> normalized my input & outputs using preProcess with 'range' method. Then I
> predicted my test set, did the multiplication and addition on predictions
> to get the real values. It gave me good results.
>
> What I want to ask is, when I try to train my network using caret, I get an
> error saying algorithm did not converge. I am thinking that I might be
> doing something wrong with my pre-processing,
>
> How would I go about using preProcess in train?
> Do I pass my not-normalized data set to the train function and train
> function handles normalization internally?
>
> You can find my R gist here
> 
>
> Thank you,
> Burak
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package MCMCpack

2019-12-04 Thread William Michels via R-help
Hi William,

It's not clear to me why you need this particular older version of
MCMCpack. From the archive I find MCMCpack_1.2-4 dates back to
2012-06-14, and MCMCpack_1.2-4.1 dates back to 2013-04-07:

MCMCpack_1.2-4.1.tar.gz 2013-04-07 00:05 481K
MCMCpack_1.2-4.tar.gz 2012-06-14 12:36 482K

Have you tried newer versions of MCMCpack? While the newest version of
MCMCpack (1.4-5) may require R (≥ 3.6), I have sessionInfo() logs
suggesting that MCMCpack_1.4-3 (2018-05-15 09:54 672K) ran just fine
under R version 3.3.3 (2017-03-06) -- "Another Canoe". In fact, an
archived version of the MCMCpack_1.4-3 DESCRIPTION file indicates that
MCMCpack_1.4-3 only requires R (>= 2.10.0).

Hope this helps,

Bill.

W. Michels, Ph.D.





On Wed, Dec 4, 2019 at 7:34 AM Prophet, William
 wrote:
>
> Yes, I should have mentioned that I tried this same line of code and I still 
> got an error.
>
>
> -Original Message-
> From: Duncan Murdoch 
> Sent: Wednesday, December 4, 2019 8:32 AM
> To: Prophet, William ; r-help@r-project.org
> Subject: Re: [R] package MCMCpack
>
> Sent by an external sender
>
> --
> On 04/12/2019 8:47 a.m., Prophet, William wrote:
> > I am running R on my work computer with the following parameters:
> >> sessionInfo()
> > R version 3.5.3 (2019-03-11)
> > Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8
> > x64 (build 9200)
> >
> >
> > I am trying to install the "MCMCpack" library. In the process however, I 
> > receive the message:
> >
> >> install.packages("MCMCpack")
> >
> > Warning in install.packages :
> >
> >package 'MCMCpack' is not available (for R version 3.5.3)
> >
> >
> > I have tried to install earlier versions of the "MCMCpack" library which I 
> > obtained from the following website:
> > https://cran.r-project.org/src/contrib/Archive/MCMCpack/
> >
> > using the following code:
> > packageurl <- 
> > "https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.2-4.tar.gz";
> > install.packages(packageurl, type="source")
>
> That's not the way to install a tarball.  You should use
>
> install.packages(packageurl, type="source", repos=NULL)
>
> However, you may still have problems:  when I tried that, the install failed 
> because of C++ errors.  I don't know if configure options (e.g.
> specifying a particular version of C++) would have fixed the problems.
>
> Duncan Murdoch
>
> >
> > but I get an identical error message:
> >
> >> packageurl <- 
> >> "https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.2-4.tar.gz";
> >
> >> install.packages(packageurl, type="source")
> >
> > Warning in install.packages :
> >
> >package
> > 'https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.2-
> > 4.tar.gz' is not available (for R version 3.5.3)
> >
> >
> >
> > I realize that updating my R version to 3.6.1 could possibly solve the 
> > problem, but this would be impractical for me because my version needs to 
> > be consistent with others on my team. In any case, the fact that I am 
> > getting errors (detailed above) with *every* version of "MCMCpack" seems 
> > odd.
> >
> > Do you have any suggestions for how I can install a working version of 
> > "MCMCpack" for my version of R?
> >
> > Thank you,
> > Bill
> >
> >
> >
> >
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package MCMCpack

2019-12-04 Thread William Michels via R-help
You can try installing the mcmc package first:

https://cran.r-project.org/web/packages/mcmc/index.html
https://cran.r-project.org/src/contrib/Archive/mcmc/

I've used mcmc_0.9-5 with MCMCpack_1.4-3 under R version 3.3.3.
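
Roughly (untested here; the exact archive file name for mcmc below is an
assumption, so check the Archive listing first, and building from source
on Windows also needs Rtools):

install.packages("https://cran.r-project.org/src/contrib/Archive/mcmc/mcmc_0.9-5.tar.gz",
                 repos = NULL, type = "source")
install.packages("https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.4-3.tar.gz",
                 repos = NULL, type = "source")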

HTH, Bill.

W. Michels, Ph.D.


On Wed, Dec 4, 2019 at 2:52 PM Prophet, William
 wrote:
>
> Thank you for your reply. I have tried various versions and I get an error in 
> each case. In the case of MCMCpack_1.4-3 although the error may be related to 
> something specific in my configuration. In any case, I get the following 
> error when I try to install that version:
>
> > packageurl <- 
> > "https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.4-3.tar.gz";
> > install.packages(packageurl, repos=NULL, type="source")
> trying URL 
> 'https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.4-3.tar.gz'
> Content type 'application/x-gzip' length 688047 bytes (671 KB)
> downloaded 671 KB
>
> ERROR: dependency 'mcmc' is not available for package 'MCMCpack'
> * removing 'C:/R/R-3.5.3/library/MCMCpack'
> In R CMD INSTALL
> Warning in install.packages :
>   installation of package 
> 'C:/Users/prophetw/AppData/Local/Temp/RtmpykZoRh/downloaded_packages/MCMCpack_1.4-3.tar.gz'
>  had non-zero exit status
>
>
>
>
>
> -Original Message-
> From: William Michels 
> Sent: Wednesday, December 4, 2019 3:45 PM
> To: Prophet, William 
> Cc: Duncan Murdoch ; r-help@r-project.org
> Subject: Re: [R] package MCMCpack
>
> Sent by an external sender
>
> --
> Hi William,
>
> It's not clear to me why you need this particular older version of MCMCpack. 
> From the archive I find MCMCpack_1.2-4 dates back to 2012-06-14, and 
> MCMCpack_1.2-4.1 dates back to 2013-04-07:
>
> MCMCpack_1.2-4.1.tar.gz 2013-04-07 00:05 481K MCMCpack_1.2-4.tar.gz 
> 2012-06-14 12:36 482K
>
> Have you tried newer versions of MCMCpack? While the newest version of 
> MCMCpack (1.4-5) may require R (≥ 3.6), I have sessionInfo() logs suggesting 
> that MCMCpack_1.4-3 (2018-05-15 09:54 672K) ran just fine under R version 
> 3.3.3 (2017-03-06) -- "Another Canoe". In fact, an archived version of the 
> MCMCpack_1.4-3 DESCRIPTION file indicates that
> MCMCpack_1.4-3 only requires R (>= 2.10.0).
>
> Hope this helps,
>
> Bill.
>
> W. Michels, Ph.D.
>
>
>
>
>
> On Wed, Dec 4, 2019 at 7:34 AM Prophet, William  
> wrote:
> >
> > Yes, I should have mentioned that I tried this same line of code and I 
> > still got an error.
> >
> >
> > -Original Message-
> > From: Duncan Murdoch 
> > Sent: Wednesday, December 4, 2019 8:32 AM
> > To: Prophet, William ;
> > r-help@r-project.org
> > Subject: Re: [R] package MCMCpack
> >
> > Sent by an external sender
> >
> > --
> > On 04/12/2019 8:47 a.m., Prophet, William wrote:
> > > I am running R on my work computer with the following parameters:
> > >> sessionInfo()
> > > R version 3.5.3 (2019-03-11)
> > > Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >=
> > > 8
> > > x64 (build 9200)
> > >
> > >
> > > I am trying to install the "MCMCpack" library. In the process however, I 
> > > receive the message:
> > >
> > >> install.packages("MCMCpack")
> > >
> > > Warning in install.packages :
> > >
> > >package 'MCMCpack' is not available (for R version 3.5.3)
> > >
> > >
> > > I have tried to install earlier versions of the "MCMCpack" library which 
> > > I obtained from the following website:
> > > https://cran.r-project.org/src/contrib/Archive/MCMCpack/
> > >
> > > using the following code:
> > > packageurl <- 
> > > "https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.2-4.tar.gz";
> > > install.packages(packageurl, type="source")
> >
> > That's not the way to install a tarball.  You should use
> >
> > install.packages(packageurl, type="source", repos=NULL)
> >
> > However, you may still have problems:  when I tried that, the install 
> > failed because of C++ errors.  I don't know if configure options (e.g.
> > specifying a particular version of C++) would have fixed the problems.
> >
> > Duncan Murdoch
> >
> > >
> > > but I get an identical error message:
> > >
> > >> packageurl <- 
> > >> "https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.2-4.tar.gz";
> > >
> > >> install.packages(packageurl, type="source")
> > >
> > > Warning in install.packages :
> > >
> > >package
> > > 'https://cran.r-project.org/src/contrib/Archive/MCMCpack/MCMCpack_1.
> > > 2- 4.tar.gz' is not available (for R version 3.5.3)
> > >
> > >
> > >
> > > I realize that updating my R version to 3.6.1 could possibly solve the 
> > > problem, but this would be impractical for me because my version needs to 
> > > be consistent with others on my team. In any case, the fact that I am 
> > > getting errors (detailed above) with *every* version of "MCMCpack" seems 
> > > odd.
> > >
> > > Do you have any suggestions for how I can 

Re: [R] Book Recommendation

2023-08-28 Thread William Michels via R-help
I'm a big fan of the sqldf package by Gabor Grothendieck:

"sqldf: Manipulate R Data Frames Using SQL"
https://CRAN.R-project.org/package=sqldf

The sqldf "README.html" converts to a 42 page PDF:
https://cran.r-project.org/web/packages/sqldf/readme/README.html

You can also find favorable blog posts for the sqldf package on the
web, notably a post (circa 2013) from Patrick Burns:
https://www.burns-stat.com/translating-r-sql-basics/
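
For instance, the flavor of the package in two lines, using the built-in
mtcars data:

library(sqldf)
sqldf("SELECT cyl, COUNT(*) AS n, AVG(mpg) AS mean_mpg FROM mtcars GROUP BY cyl")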

HTH,

Bill.

W. Michels, Ph.D.



On Mon, Aug 28, 2023 at 8:47 AM Stephen H. Dawson, DSL via R-help
 wrote:
>
> Good Morning,
>
>
> I am doing some research to develop a new course where I teach. I am
> looking for a book to use in the course content to teach accomplishing
> SQL in R.
>
> Does anyone know of a book on this topic to recommend for consideration?
>
>
> Thank You,
> --
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting Decimal numbers into Binary

2019-12-27 Thread William Michels via R-help
Hi Paul,

Since you start from strings, it's not clear to me where ASCII enters
the picture. If you really need ASCII, you can use the charToInt()
function in the "R.oo" package. Also there's the AsciiToInt() function
in the  "sfsmisc" package. If you just want to use R's native
as.numeric() conversion, there's the digitsBase() function in the
"sfsmisc" package:

> library(sfsmisc)
> digitsBase(as.numeric("63"), base = 2)
Class 'basedInt'(base = 2) [1:1]
 [,1]
[1,]1
[2,]1
[3,]1
[4,]1
[5,]1
[6,]1
>
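
If a fixed-width six-character string is what you're after, a base-R
sketch along these lines should also work (only checked here against the
toy value 63):

dec2bin6 <- function(x) paste(rev(as.integer(intToBits(x))[1:6]), collapse = "")
dec2bin6(63)   # "111111"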

HTH, Bill.

W. Michels, Ph.D.


On Fri, Dec 27, 2019 at 8:11 AM Marc Schwartz via R-help
 wrote:
>
>
> > On Dec 27, 2019, at 10:42 AM, Paul Bernal  wrote:
> >
> > Dear friends,
> >
> > Hope you are all doing well. I need to find a way to convert ascii numbers
> > to six digit binary numbers:
> >
> > I am working with this example, I converted the string to ascii, and
> > finally to decimal, but I am having trouble converting the decimal numbers
> > into their six digit binary representation. The code below is exactly what
> > I have so far:
> >
> > ascii_datformat <- utf8ToInt("133m@ogP00PD;88MD5MTDww@2D7k")
> > ascii_datformat
> >
> > Base <- ascii_datformat - 48
> >
> > ifelse(Base > 40, Base-8, Base)
> >
> > x <- rev(intToBits(Base))
> > dec2bin <- function(x) paste(as.integer(rev(intToBits(x))), collapse = "")
> > dec2bin
> >
> > any guidance will be greatly appreciated,
> >
> > Best regards,
> >
> > Paul
>
>
> You might look at the intToBin() function in Henrik's R.utils package on CRAN:
>
> https://cran.r-project.org/web/packages/R.utils/index.html
>
> Regards,
>
> Marc Schwartz
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rnoaa library

2019-12-30 Thread William Michels via R-help
Hi Jeff,

You might have better luck posting your question on the R-SIG-Geo
mailing list, or perusing their archive. I've found a thread
pertaining to the rnoaa package from August 2016, along with a
particularly informative reply (reply link below):

https://stat.ethz.ch/mailman/listinfo/R-SIG-Geo
https://stat.ethz.ch/pipermail/r-sig-geo/
https://stat.ethz.ch/pipermail/r-sig-geo/2016-August/024768.html

If the above links don't help, you might consider checking for open
(or even closed) issues on Github:

https://github.com/ropensci/rnoaa/issues

HTH, Bill.

W. Michels, Ph.D.



On Sun, Dec 29, 2019 at 11:51 AM Jeff Reichman  wrote:
>
> r-help Forum
>
> Anyone familiar with the "rnoaa" library?  I'm trying to pull NOAA  temp
> data. I have a key but when I run the code highlighted in yellow  ..
>
> Warning message:
> Sorry, no data found
>
> No matter what station_id I use.
>
> # library
> library(rnoaa)
> library(lubridate)
>
> # set key
> options(noaakey = "")
>
> start_date = "2018-01-15"
> end_date = "2018-01-31"
> station_id = "USW00013994"
>
> weather_data <- ncdc(datasetid='NORMAL_HLY', stationid=paste0('GHCND:',
> station_id),
>  datatypeid = "HLY-TEMP-NORMAL",
>  startdate = start_date, enddate = end_date, limit=500)
> data <- weather_data$data
>
> data$year <- year(data$date)
> data$month <- month(data$date)
> data$day <- day(data$date)
> # summarize to average daily temps
> aggregate(value ~ year + month + day, mean, data = data)
>
> Sincerely
>
> Jeff Reichman
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data Carpentry - Creating a New SQLite Database

2020-01-10 Thread William Michels via R-help
Hi Phillip,

Skipping to the last few lines of your email, did you download a
program to look at Sqlite databases (independent of R) as listed
below? Maybe that program ("DB Browser for SQLite") and/or the
instructions below can help you locate your database directory:

https://datacarpentry.org/semester-biology/computer-setup/
https://datacarpentry.org/semester-biology/materials/sql-for-dplyr-users/

If you do have that program, and you're still seeing an error, you
might consider looking for similar issues at the appropriate
'datacarpentry' repository on Github (or posting a new issue
yourself):

https://github.com/datacarpentry/R-ecology-lesson/issues

Finally, I really feel you'll benefit from reading over the documents
pertaining to "R Data Import/Export" on the www.r-project.org website.
No disrespect to the people at 'datacarpentry', but you'll find
similar (and possibly, easier) R code to follow at section 4.3.1
'Packages using DBI' :

https://cran.r-project.org/doc/manuals/r-release/R-data.html
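
For instance, a bare-bones DBI/RSQLite version of the same steps might
look like this (it assumes the 'surveys' and 'plots' data frames from
your read_csv() calls, and that the 'data' directory already exists):

library(DBI)
con <- dbConnect(RSQLite::SQLite(), "data/portal-database-output.sqlite")
dbWriteTable(con, "surveys", surveys, overwrite = TRUE)
dbWriteTable(con, "plots", plots, overwrite = TRUE)
dbListTables(con)                   # confirm the tables are there
head(dbReadTable(con, "surveys"))   # pull a table back out
dbDisconnect(con)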

HTH, Bill.

W. Michels, Ph.D.




On Fri, Jan 10, 2020 at 10:32 AM Phillip Heinrich  wrote:
>
> Working my way through a tutorial named Data Carpentry 
> (https://datacarpentry.org/R-ecology-lesson/).  for the most part it is 
> excellent but I’m stuck on the very last section 
> (https://datacarpentry.org/R-ecology-lesson/05-r-and-databases.html).
>
> First, below are the packages I have loaded:
> [1] "forcats"   "stringr"   "purrr" "readr" "tidyr" "tibble"
> "ggplot2"   "tidyverse" "dbplyr""RMySQL""DBI"
> [12] "dplyr" "RSQLite"   "stats" "graphics"  "grDevices" "utils" 
> "datasets"  "methods"   "base"
>
>
> >
>
>
> Second, below is the text of the last section of the last chapter, titled
> "Creating a New SQLite Database".  The black type is from the tutorial.  The
> green and blue are the suggested R code.  My comments are in red.
> Creating a new SQLite database
> So far, we have used a previously prepared SQLite database. But we can also 
> use R to create a new database, e.g. from existing csv files. Let’s recreate 
> the mammals database that we’ve been working with, in R. First let’s download 
> and read in the csv files. We’ll import tidyverse to gain access to the 
> read_csv() function.
>
> download.file("https://ndownloader.figshare.com/files/3299483",
>   "data_raw/species.csv")
> download.file("https://ndownloader.figshare.com/files/10717177",
>   "data_raw/surveys.csv")
> download.file("https://ndownloader.figshare.com/files/3299474",
>   "data_raw/plots.csv")
> library(tidyverse)
> species <- read_csv("data_raw/species.csv")
>
> No problem here.  I'm pulling three databases from the Web and saving them
> to a folder on my hard drive (...data_raw/species.csv), etc.
>
> surveys <- read_csv("data_raw/surveys.csv")
> plots <- read_csv("data_raw/plots.csv")
>
> Again no problem.  I'm just creating R data files.  But here is where I lose
> it.  I'm creating something named my_db_file from another file named
> portal-database-output with an sqlite extension and then creating my_db from
> the my_db_file.  Not sure where the sqlite extension file came from.
>
> Creating a new SQLite database with dplyr is easy. You can re-use the same
> command we used above to open an existing .sqlite file. The create = TRUE
> argument instructs R to create a new, empty database instead.
>
> Caution: When create = TRUE is added, any existing database at the same
> location is overwritten without warning.
>
> my_db_file <- "data/portal-database-output.sqlite"
> my_db <- src_sqlite(my_db_file, create = TRUE)
>
> Currently, our new database is empty, it doesn't contain any tables:
>
> my_db
> #> src:  sqlite 3.29.0 [data/portal-database-output.sqlite]
> #> tbls:
>
> To add tables, we copy the existing data.frames into the database one by one:
>
> copy_to(my_db, surveys)
> copy_to(my_db, plots)
> my_db
>
> I can follow the directions to fill in my_db but I have no idea how to
> access the tables.  The text from the tutorial below says to check the
> location of our database.  Huh!  Can someone give me some direction?  Thanks.
>
>
>
>
>
> If you check the location of our database you’ll see that data is 
> automatically being written to disk. R and dplyr not only provide easy ways 
> to query existing databases, they also allow you to easily create your own 
> databases from flat files!
>
>
>
> Here is where I lose it.
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, s

Re: [R] File names for mac newby

2020-01-21 Thread William Michels via R-help
Hi David,

Often on a Mac you can "right click" (or on a laptop--press down with
two fingers), and a pop-up will give you the option to "Copy File
Path". (You can also find this option in a Finder window under the
"Finder -> Services" menu bar.) This is the path you should use to
import your file into R.
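
In your case the call will probably end up looking something like this
(the exact path is a guess -- paste in whatever "Copy File Path" gives you):

temps <- read.table("/Users/DFP/Documents/ah/house/HouseTemps.txt",
                    header=TRUE, row.names=1)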

There's also an R-Help mailing list for Mac users (R-SIG-Mac):
https://stat.ethz.ch/mailman/listinfo/r-sig-mac

HTH, Bill.

W. Michels, Ph.D.

On Tue, Jan 21, 2020 at 9:38 AM David  wrote:
>
> I moved to a mac a few months ago after years in windows, and I'm still
> learning basics.  I'm wanting to create a data frame based on a text
> file called HouseTemps.txt.  That's a file within one called house which
> is within one called ah.  That may further be in one called  Documents.
> I tried various lines like:
>
> temps <-
> read.table("c:\\Users\\DFP\\Documents\\ah\\house\\HouseTemps.txt",header=T,row.names=1)
>
> based on my windows DOS experience, but nothing I try works.  So my
> question is, what do complete file names look like in a mac?
>
> I tried Apple support, but they couldn't help me with R.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Analyzing Baseball Data With R

2020-03-16 Thread William Michels via R-help
Hi Phillip,

Generally these problems come down to knowing/setting your working
directory. The first question is whether you have a directory named
"data" inside your "C:/Users/Owner/Documents" directory? You may need
to create this directory first, outside of R and/or RStudio (using
your Windows OS).

Below is some example code (on a Mac). You might want to try 1)
creating the "data" directory in Windows as mentioned earlier, 2)
setting your working directory to "C:/Users/Owner/Documents/data",
then 3) re-running your src_sqlite() command giving "pitchrx.sqlite"
as the database file name.

> getwd()
[1] "/Users/homedir/R_Dir"
> setwd("/Users/homedir/R_Dir/data")
Error in setwd("/Users/homedir/R_Dir/data") :
  cannot change working directory
>
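
Put together, steps 1) through 3) might look roughly like this (untested
on Windows; the directory can also be created from inside R with
dir.create() instead of in the file manager):

dir.create("C:/Users/Owner/Documents/data", showWarnings = FALSE)
setwd("C:/Users/Owner/Documents/data")
db <- src_sqlite("pitchrx.sqlite", create = TRUE)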

HTH, Bill.

W. Michels, Ph.D.


On Mon, Mar 16, 2020 at 3:35 PM Phillip Heinrich  wrote:
>
> Can’t get past first step of Chapter 7 page 164.
>
> Opened a new RStudio window.  Loaded tidyverse and keyed in 
> library(tidyverse) which of course includes dplyr.  The working directory is: 
> C:/Users/Owner/Documents.
>
> Then keyed in: db <- src_sqlite(“data/pitchrx.sqlite”,create=TRUE)
>
> And got the following error: Error: Could not connect to database:
> unable to open database file
>
> Googled everything I could think of to find the sqlite function and the 
> pitchrx.sqlite empty data base.  Can someone give me some direction?
>
> I wondering if I have configured RStudio incorrectly.  Why doesn’t my by 
> RStudio point to the correct data file?
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tidyverse Question

2020-03-27 Thread William Michels via R-help
Dear Ista (and Phillip),

Ista, that's the exact same advice I gave Phillip over a week ago:

https://stat.ethz.ch/pipermail/r-help/2020-March/465994.html

Phillip, it doesn't make sense to post the same question under
different subject headings. While I'm convinced you're making a
sincere effort to learn R on your own--i.e. this isn't
homework--anyone trying to help you (Ista, Jeff, Bert, John, Stefan,
Duncan, etc.) will need feedback on what advice you've taken from
people on this list, and what the outcome was. So please provide it.

I do have one last suggestion, that you contact the author of the book
you're using to see if there has been an error in his/her coding
instructions. You can file an "Issue" on Github, and/or find the
author's email there and email him/her directly:

https://github.com/maxtoki/baseball_R
https://github.com/maxtoki

HTH, Bill.

W. Michels, Ph.D.





On Mon, Mar 23, 2020 at 3:37 PM Ista Zahn  wrote:
>
> Hi Phillip,
>
> On Mon, Mar 23, 2020 at 6:33 PM Phillip Heinrich  wrote:
> >
> > Can someone out there run the following code from the book Analyzing 
> > Baseball Data with R – Chapter 7 page 164?
> >
> > library(tidyverse)
> > db <- src_sqlite(“data/pitchrx.sqlite”,create=TRUE)
> >
> > Over the past two weeks this code has run correctly twice but I have gotten 
> > the following error dozens of times:
> >
> > Error: Could not connect to database:
> > unable to open database file
>
> Probably you working directory has no sub-directory named 'data'.
>
> Best,
> Ista
> >
> > I’m trying to figure out if the problem is with my computer or if the 
> > tidyverse package has been revised since this book was written.  I got the 
> > same error when I loaded R onto my wife’s Mac.
> >
> > The file pitchrx.sqlite loaded into my directory C:/Users/Owner/Documents.  
> > The data file db contains four xml files used later in the analysis.
> >
> > Thanks.
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ncol() vs. length() on data.frames

2020-03-31 Thread William Michels via R-help
Hi Ivan,

Like Ivan Krylov, I'm not aware of circumstances for simple dataframes
where ncol(DF) does not equal length(DF).

As I understand it, using ncol() versus length() is important when
you're examining an object returned from a function like sapply(),
since sapply() will simplify one-column dataframes to vectors. Much
has been made of this sapply() feature, but it's simple enough in your
code to test whether ncol() of an object returned by sapply() (or by
single-bracket subsetting, as below) is NULL:

> dim(testDF)
[1] 14  8
> length(testDF)
[1] 8
> length(testDF[1])
[1] 1
> length(testDF[, 1])
[1] 14
>
> ncol(testDF)
[1] 8
> ncol(testDF[1])
[1] 1
> ncol(testDF[, 1])
NULL
> is.null(ncol(testDF[, 1]))
[1] TRUE
>
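
The same check works with a built-in dataset, if you want something
copy-paste-able (drop = FALSE is what keeps a one-column object a
data.frame):

ncol(mtcars[, 1])                 # NULL -- a single column drops to a vector
ncol(mtcars[, 1, drop = FALSE])   # 1
length(mtcars[, 1])               # 32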

HTH, Bill.

W. Michels, Ph.D.






On Tue, Mar 31, 2020 at 7:11 AM Ivan Calandra  wrote:
>
> Thanks Ivan for the answer.
>
> So it confirms my first thought that these two functions are equivalent
> when applied to a "simple" data.frame.
>
> The reason I was asking is because I have gotten used to use length() in
> my scripts. It works perfectly and I understand it easily. But to be
> honest, ncol() is more intuitive to most users (especially the novice)
> so I was thinking about switching to using this function instead (all my
> data.frames are created from read.csv() or similar functions so there
> should not be any issue). But before doing that, I want to be sure that
> it is not going to create unexpected results.
>
> Thank you,
> Ivan
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>
> On 31/03/2020 16:00, Ivan Krylov wrote:
> > On Tue, 31 Mar 2020 14:47:54 +0200
> > Ivan Calandra  wrote:
> >
> >> On a simple data.frame (i.e. each element is a vector), ncol() and
> >> length() will give the same result.
> >> Are they just equivalent on such objects, or are they differences in
> >> some cases?
> > I am not aware of any exceptions to ncol(dataframe)==length(dataframe)
> > (in fact, ncol(x) is dim(x)[2L] and ?dim says that dim(dataframe)
> > returns c(length(attr(dataframe, 'row.names')), length(dataframe))), but
> > watch out for AsIs columns which can have columns of their own:
> >
> > x <- data.frame(I(volcano))
> > dim(x)
> > # [1] 87  1
> > length(x)
> > # [1] 1
> > dim(x[,1])
> > # [1] 87 61
> >
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subtracting Data Frame With a Different Number of Rows

2020-04-21 Thread William Michels via R-help
Hi Phillip,

You have two choices here: 1. Manually enter the missing rows into
your individual.df using rbind(), and cbind() the overall.df and
individual.df dataframes together (assuming the rows line up
properly), or 2. Use merge() to perform an SQL-like "Left Join", and
copy values from the "overall" columns to fill in missing values in
the "indiv" columns (imputation). Below is code starting from a .tsv
files showing the second (merge) method. Note: I've only included the
first 4 rows of data after the merge command (there are 24 rows
total):

> overall <- read.delim("overall.R", sep="\t")
> indiv <- read.delim("individual.R", sep="\t")
> merge(overall, indiv, all.x=TRUE, by.x=c("RunnerCode", "Outs"), 
> by.y=c("RunnerCode", "Outs"))

RunnerCode Outs X.x MeanRuns.x X.y MeanRuns.y
1   BasesEmpty    0   1  0.5137615   1  0.4262295
2   BasesEmpty    1   9  0.3963801   8  0.5238095
3   BasesEmpty    2  17  0.4191011  15  0.3469388
4  BasesLoaded    0   8  3.2173913  NA         NA
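
From there, the fill-in (imputation) and comparison steps could be a
rough sketch like this (column names as they come out of the merge above;
'RunsAboveTeam' is just a name I made up):

m <- merge(overall, indiv, all.x=TRUE, by.x=c("RunnerCode", "Outs"),
           by.y=c("RunnerCode", "Outs"))
miss <- is.na(m$MeanRuns.y)
m$MeanRuns.y[miss] <- m$MeanRuns.x[miss]   # borrow the team value where the player has none
m$RunsAboveTeam <- m$MeanRuns.y - m$MeanRuns.x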


HTH, Bill.

W. Michels, Ph.D.


On Tue, Apr 21, 2020 at 1:47 PM Phillip Heinrich  wrote:
>
> I have two small data frames of baseball data.  The first one is the mean
> number of runs that will score in each half inning for the 2018 Arizona
> Diamondbacks.  The second data frame is the same information but for only
> one player.  As you will see the individual player did not come up to bat
> any time during the season:
> with the bases loaded and no outs
> runners on first and third with one out
>
> Overall
>
> RunnerCodeOuts MeanRuns
> 1 Bases Empty 0   0.5137615
> 2 Runner:1st0   0.8967391
> 3 Runner:2nd   0   1.3018868
> 4 Runners:1st & 2nd0   1.6551724
> 5 Runner:3rd0   1.9545455
> 6 Runners:1st & 3rd 0   2.0571429
> 7 Runners:2nd & 3rd0   2.1578947
> 8 Bases Loaded0   3.2173913
> 9 Bases Empty  1   0.3963801
> 10 Runner:1st   1   0.6952596
> 11 Runner:2nd  1   0.9580838
> 12 Runners:1st & 2nd   1   1.4397163
> 13 Runner:3rd   1   1.5352113
> 14 Runners:1st & 3rd   11.5882353
> 15 Runners:2nd & 3rd  11.9215686
> 16 Bases Loaded  11.9193548
> 17 Bases Empty20.4191011
> 18 Runner:1st   20.5531915
> 19 Runner:2nd  20.8777293
> 20 Runners:1st & 2nd  2 0.9553073
> 21 Runner:3rd  2 1.2783505
> 22 Runners:1st & 3rd   2 1.5851064
> 23 Runners:2nd & 3rd  2 1.2794118
> 24 Bases Loaded 2  1.388235
>
> Individual Player
>
>   RunnerCode  Outs   MeanRuns
> 1 Bases Empty 0 0.4262295
> 2 Runner:1st0 1.320
> 3 Runner:2nd   0 1.2857143
> 4 Runners:1st & 2nd   0  0.5714286
> 5 Runner:3rd   0  2.000
> 6 Runners:1st & 3rd0  3.500
> 7 Runners:2nd & 3rd   0  1.000
> 8 Bases Empty 1  0.5238095
> 9 Runner:1st1  0.6578947
> 10 Runner:2nd 1  0.375
> 11 Runners:1st & 2nd 1   1.4285714
> 12 Runner:3rd 1   1.4285714
> 13 Runners:2nd & 3rd 1   0.667
> 14 Bases Loaded 1   3.000
> 15 Bases Empty   2   0.3469388
> 16 Runner:1st  2   0.1363636
> 17 Runner:2nd 2   0.7142857
> 18 Runners:1st & 2nd  2   1.667
> 19 Runner:3rd  2   1.250
> 20 Runners:1st & 3rd  22.1428571
> 21 Runners:2nd & 3rd 21.500
> 22 Bases Loaded 22.200
>
> RunnersCode is a factor
> Outs are integers
> MeanRuns is numerical data
>
> I would like to subtract the second from the first as a way to evaluate the
> players ability to produce runs. As part of this analysis I I would like to
> input the mean number of runs from the overall data frame into the two
> missing cells for the individual player:Bases Loaded no outs and 1st and 3rd
> one out.
>
> Can anyone give me some advise?
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinf

Re: [R] iterators : checkFunc with ireadLines

2020-05-18 Thread William Michels via R-help


Hi Laurent,

Thank you for explaining your size limitations. Below is an example
using the read.fwf() function to grab the first column of your input
file (in 2000 row chunks). This column is converted to an index, and
the index is used to create an iterator useful for skipping lines when
reading input with scan(). (You could try processing your large file
in successive 2000 line chunks, or whatever number of lines fits into
memory). Maybe not as elegant as the approach you were going for, but
read.fwf() should be pretty efficient:

> sensors <-  c("N053", "N163")
> read.fwf("test2.txt", widths=c(4), as.is=TRUE, flush=TRUE, n=2000, skip=0)
V1
1 Time
2 N023
3 N053
4 N123
5 N163
6 N193
> first_col <- read.fwf("test2.txt", widths=c(4), as.is=TRUE, flush=TRUE, 
> n=2000, skip=0)
> which(first_col$V1 %in% sensors)
[1] 3 5
> index1 <- which(first_col$V1 %in% sensors)
> iter_index1 <- iter(1:2000, checkFunc= function(n) {n %in% index1})
> unlist(scan(file="test2.txt", what=list("","","","","","","","","",""), 
> flush=TRUE, multi.line=FALSE, skip=nextElem(iter_index1)-1, nlines=1, 
> quiet=TRUE))
 [1] "N053"  "-0.014083" "-0.004741" "0.001443"  "-0.010152"
"-0.012996" "-0.005337" "-0.008738" "-0.015094" "-0.012104"
> unlist(scan(file="test2.txt", what=list("","","","","","","","","",""), 
> flush=TRUE, multi.line=FALSE, skip=nextElem(iter_index1)-1, nlines=1, 
> quiet=TRUE))
 [1] "N163"  "-0.054023" "-0.049345" "-0.037158" "-0.04112"
"-0.044612" "-0.036953" "-0.036061" "-0.044516" "-0.046436"
>

(Note for this email and the previous one, I've deleted the first
"hash" character from each line of your test file for clarity).

HTH, Bill.

W. Michels, Ph.D.





On Mon, May 18, 2020 at 3:35 AM Laurent Rhelp  wrote:
>
> Dear William,
>   Thank you for your answer
> My file is very large, so I cannot read it all into memory (I cannot use
> read.table). So I want to put in memory only the lines I need to process.
> With readLines, as I did, it works, but I would like to use an iterator
> and a foreach loop to understand this way of doing it, because I thought
> it was a better solution for writing clean code.
>
>
> Le 18/05/2020 à 04:54, William Michels a écrit :
> > Apologies, Laurent, for this two-part answer. I misunderstood your
> > post where you stated you wanted to "filter(ing) some
> > selected lines according to the line name... ." I thought that meant
> > you had a separate index (like a series of primes) that you wanted to
> > use to only read-in selected line numbers from a file (test file below
> > with numbers 1:1000 each on a separate line):
> >
> >> library(gmp)
> >> library(iterators)
> >> iprime <- iter(1:100, checkFunc = function(n) isprime(n))
> >> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
> > Read 1 item
> > [1] 2
> >> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
> > Read 1 item
> > [1] 3
> >> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
> > Read 1 item
> > [1] 5
> >> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
> > Read 1 item
> > [1] 7
> > However, what it really seems that you want to do is read each line of
> > a (possibly enormous) file, test each line "string-wise" to keep or
> > discard, and if you're keeping it, append the line to a list. I can
> > certainly see the advantage of this strategy for reading in very, very
> > large files, but it's not clear to me how the "ireadLines" function (
> > in the "iterators" package) will help you, since it doesn't seem to
> > generate anything but a sequential index.
> >
> > Anyway, below is an absolutely standard read-in of your data using
> > read.table(). Hopefully some of the code I've posted has been useful
> > to you.
> >
> >> sensors <-  c("N053", "N163")
> >> read.table("test2.txt")
> >  V1V2V3V4V5V6V7
> > V8V9   V10
> > 1 Time  0.00  0.000999  0.001999  0.002998  0.003998  0.004997
> > 0.005997  0.006996  0.007996
> > 2 N023 -0.031323 -0.035026 -0.029759 -0.024886 -0.024464 -0.026816
> > -0.033690 -0.041067 -0.038747
> > 3 N053 -0.014083 -0.004741  0.001443 -0.010152 -0.012996 -0.005337
> > -0.008738 -0.015094 -0.012104
> > 4 N123 -0.019008 -0.013494 -0.013180 -0.029208 -0.032748 -0.020243
> > -0.015089 -0.014439 -0.011681
> > 5 N163 -0.054023 -0.049345 -0.037158 -0.041120 -0.044612 -0.036953
> > -0.036061 -0.044516 -0.046436
> > 6 N193 -0.022171 -0.022384 -0.022338 -0.023304 -0.022569 -0.021827
> > -0.021996 -0.021755 -0.021846
> >> Laurent_data <- read.table("test2.txt")
> >> Laurent_data[Laurent_data$V1 %in% sensors, ]
> >  V1V2V3V4V5V6V7
> > V8V9   V10
> > 3 N053 -0.014083 -0.004741  0.001443 -0.010152 -0.012996 -0.005337
> > -0.008738 -0.015094 -0.012104
> > 5 N163 -0.054023 -0.049345 -0.037158 -0.041120 -0.044612 -0.036953
> > -0.036061 -0.044516 -0.046436
> >
> > Best, Bill.
> >
> > W. Michels, Ph.D.
> >
>

Re: [R] iterators : checkFunc with ireadLines

2020-05-18 Thread William Michels via R-help


Apologies, Laurent, for this two-part answer. I misunderstood your
post where you stated you wanted to "filter(ing) some
selected lines according to the line name... ." I thought that meant
you had a separate index (like a series of primes) that you wanted to
use to only read-in selected line numbers from a file (test file below
with numbers 1:1000 each on a separate line):

> library(gmp)
> library(iterators)
> iprime <- iter(1:100, checkFunc = function(n) isprime(n))
> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 2
> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 3
> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 5
> scan(file="one_thou_lines.txt", skip=nextElem(iprime)-1, nlines=1)
Read 1 item
[1] 7
>

However, what it really seems that you want to do is read each line of
a (possibly enormous) file, test each line "string-wise" to keep or
discard, and if you're keeping it, append the line to a list. I can
certainly see the advantage of this strategy for reading in very, very
large files, but it's not clear to me how the "ireadLines" function (
in the "iterators" package) will help you, since it doesn't seem to
generate anything but a sequential index.

Anyway, below is an absolutely standard read-in of your data using
read.table(). Hopefully some of the code I've posted has been useful
to you.

> sensors <-  c("N053", "N163")
> read.table("test2.txt")
V1V2V3V4V5V6V7
   V8V9   V10
1 Time  0.00  0.000999  0.001999  0.002998  0.003998  0.004997
0.005997  0.006996  0.007996
2 N023 -0.031323 -0.035026 -0.029759 -0.024886 -0.024464 -0.026816
-0.033690 -0.041067 -0.038747
3 N053 -0.014083 -0.004741  0.001443 -0.010152 -0.012996 -0.005337
-0.008738 -0.015094 -0.012104
4 N123 -0.019008 -0.013494 -0.013180 -0.029208 -0.032748 -0.020243
-0.015089 -0.014439 -0.011681
5 N163 -0.054023 -0.049345 -0.037158 -0.041120 -0.044612 -0.036953
-0.036061 -0.044516 -0.046436
6 N193 -0.022171 -0.022384 -0.022338 -0.023304 -0.022569 -0.021827
-0.021996 -0.021755 -0.021846
> Laurent_data <- read.table("test2.txt")
> Laurent_data[Laurent_data$V1 %in% sensors, ]
V1V2V3V4V5V6V7
   V8V9   V10
3 N053 -0.014083 -0.004741  0.001443 -0.010152 -0.012996 -0.005337
-0.008738 -0.015094 -0.012104
5 N163 -0.054023 -0.049345 -0.037158 -0.041120 -0.044612 -0.036953
-0.036061 -0.044516 -0.046436

Best, Bill.

W. Michels, Ph.D.


On Sun, May 17, 2020 at 5:43 PM Laurent Rhelp  wrote:
>
> Dear R-Help List,
>
> I would like to use an iterator to read a file filtering some
> selected lines according to the line name in order to use after a
> foreach loop. I wanted to use the checkFunc argument as the following
> example found on internet to select only prime numbers :
>
> iprime <- iter(1:100, checkFunc = function(n) isprime(n))
>
> (https://datawookie.netlify.app/blog/2013/11/iterators-in-r/)
>
> but the checkFunc argument seems not to be available with the function
> ireadLines (package iterators). So, I did the code below to solve my
> problem but I am sure that I miss something to use iterators with files.
> Since I found nothing on the web about ireadLines and the checkFunc
> argument, could somebody help me to understand how we have to use
> iterator (and foreach loop) on files keeping only selected lines ?
>
> Thank you very much
> Laurent
>
> Presently here is my code:
>
> ##mock file to read: test.txt
> ##
> # Time  0          0.000999   0.001999   0.002998   0.003998   0.004997   0.005997   0.006996   0.007996
> # N023  -0.031323  -0.035026  -0.029759  -0.024886  -0.024464  -0.026816  -0.03369   -0.041067  -0.038747
> # N053  -0.014083  -0.004741  0.001443   -0.010152  -0.012996  -0.005337  -0.008738  -0.015094  -0.012104
> # N123  -0.019008  -0.013494  -0.01318   -0.029208  -0.032748  -0.020243  -0.015089  -0.014439  -0.011681
> # N163  -0.054023  -0.049345  -0.037158  -0.04112   -0.044612  -0.036953  -0.036061  -0.044516  -0.046436
> # N193  -0.022171  -0.022384  -0.022338  -0.023304  -0.022569  -0.021827  -0.021996  -0.021755  -0.021846
>
>
> # sensors to keep
>
> sensors <-  c("N053", "N163")
>
>
> library(iterators)
>
> library(rlist)
>
>
> file_name <- "test.txt"
>
> con_obj <- file( file_name , "r")
> ifile <- ireadLines( con_obj , n = 1 )
>
>
> ## I do not do a loop for the example
>
> res <- list()
>
> r <- get_Lines_iter( ifile , sensors)
> res <- list.append( res , r )
> res
> r <- get_Lines_iter( ifile , sensors)
> res <- list.append( res , r )
> res
> r <- get_Lines_iter( ifile , sensors)
> do.call("cbind",res)
>
> ## the function get_Lines_iter to select and process the line
>
> get_Lines_iter  <-  fun

Re: [R] iterators : checkFunc with ireadLines

2020-05-18 Thread William Michels via R-help


Dear Laurent,

I'm going through your code quickly, and the first question I have is
whether you loaded the "gmp" library?

> library(gmp)

Attaching package: ‘gmp’

The following objects are masked from ‘package:base’:

%*%, apply, crossprod, matrix, tcrossprod

> library(iterators)
> iter(1:100, checkFunc = function(n) isprime(n))
$state


$length
[1] 100

$checkFunc
function (n)
isprime(n)

$recycle
[1] FALSE

attr(,"class")
[1] "containeriter" "iter"
>

HTH, Bill.

W. Michels, Ph.D.



On Sun, May 17, 2020 at 5:43 PM Laurent Rhelp  wrote:
>
> Dear R-Help List,
>
> I would like to use an iterator to read a file filtering some
> selected lines according to the line name in order to use after a
> foreach loop. I wanted to use the checkFunc argument as the following
> example found on internet to select only prime numbers :
>
> iprime <- iter(1:100, checkFunc = function(n) isprime(n))
>
> (https://datawookie.netlify.app/blog/2013/11/iterators-in-r/)
>
> but the checkFunc argument seems not to be available with the function
> ireadLines (package iterators). So, I did the code below to solve my
> problem but I am sure that I miss something to use iterators with files.
> Since I found nothing on the web about ireadLines and the checkFunc
> argument, could somebody help me to understand how we have to use
> iterator (and foreach loop) on files keeping only selected lines ?
>
> Thank you very much
> Laurent
>
> Presently here is my code:
>
> ##mock file to read: test.txt
> ##
> # Time   0          0.000999   0.001999   0.002998   0.003998   0.004997   0.005997   0.006996   0.007996
> # N023   -0.031323  -0.035026  -0.029759  -0.024886  -0.024464  -0.026816  -0.03369   -0.041067  -0.038747
> # N053   -0.014083  -0.004741  0.001443   -0.010152  -0.012996  -0.005337  -0.008738  -0.015094  -0.012104
> # N123   -0.019008  -0.013494  -0.01318   -0.029208  -0.032748  -0.020243  -0.015089  -0.014439  -0.011681
> # N163   -0.054023  -0.049345  -0.037158  -0.04112   -0.044612  -0.036953  -0.036061  -0.044516  -0.046436
> # N193   -0.022171  -0.022384  -0.022338  -0.023304  -0.022569  -0.021827  -0.021996  -0.021755  -0.021846
>
>
> # sensors to keep
>
> sensors <-  c("N053", "N163")
>
>
> library(iterators)
>
> library(rlist)
>
>
> file_name <- "test.txt"
>
> con_obj <- file( file_name , "r")
> ifile <- ireadLines( con_obj , n = 1 )
>
>
> ## I do not do a loop for the example
>
> res <- list()
>
> r <- get_Lines_iter( ifile , sensors)
> res <- list.append( res , r )
> res
> r <- get_Lines_iter( ifile , sensors)
> res <- list.append( res , r )
> res
> r <- get_Lines_iter( ifile , sensors)
> do.call("cbind",res)
>
> ## the function get_Lines_iter to select and process the line
>
> get_Lines_iter <- function( iter, sensors, sep = '\t', quiet = FALSE ){
>   ## read the next record in the iterator
>   r <- try( nextElem(iter) )
>   while( TRUE ){
>     if( class(r) == "try-error" ) {
>       return( stop("The iterator is empty") )
>     } else {
>       ## split the read line according to the separator
>       r_txt <- textConnection(r)
>       fields <- scan(file = r_txt, what = "character", sep = sep, quiet = quiet)
>       ## test if we have to keep the line
>       if( fields[1] %in% sensors ){
>         ## data processing for the selected line (for the example, transformation in dataframe)
>         n <- length(fields)
>         x <- data.frame( as.numeric(fields[2:n]) )
>         names(x) <- fields[1]
>         ## We return the values
>         print(paste0("sensor ", fields[1], " ok"))
>         return( x )
>       } else {
>         print(paste0("Sensor ", fields[1], " not selected"))
>         r <- try( nextElem(iter) )
>       }
>     }
>   } # end while loop
> }
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iterators : checkFunc with ireadLines

2020-05-23 Thread William Michels via R-help
Hi Laurent,

Seeking to give you an "R-only" solution, I thought the read.fwf()
function might be useful (to read-in your first column of data, only).
However Jeff is correct that this is a poor strategy, since read.fwf()
reads the entire file into R (documented in "Fixed-width-format
files", Section 2.2: R Data Import/Export Manual).

Jeff has suggested a number of packages, as well as using a database.
Ivan Krylov has posted answers using grep, awk and perl (perl5--to
disambiguate). [In point of fact, the R Data Import/Export Manual
suggests using perl]. Similar to Ivan, I've posted code below using
the Raku programming language (the language formerly known as Perl6).
Regexes are claimed to be more readable, but are currently very slow
in Raku. However on the plus side, the language is designed to handle
Unicode gracefully:

> # pipe() using raku-grep on Laurent's data (sep=mult whitespace):
> con_obj1 <- pipe(paste("raku -e '.put for lines.grep( / ^^N053 | ^^N163 /, :p 
> );' ", "Laurents.txt"), open="rt");
> p6_import_a <- scan(file=con_obj1, what=list("","","","","","","","","",""), 
> flush=TRUE, multi.line=FALSE, quiet=TRUE);
> close(con_obj1);
> as.data.frame(sapply(p6_import_a, t), stringsAsFactors=FALSE);
  V1   V2        V3        V4        V5        V6        V7        V8        V9       V10
1  2 N053 -0.014083 -0.004741  0.001443 -0.010152 -0.012996 -0.005337 -0.008738 -0.015094
2  4 N163 -0.054023 -0.049345 -0.037158  -0.04112 -0.044612 -0.036953 -0.036061 -0.044516
>
> # pipe() using raku-grep "starts-with" to find genbankID ( >3GB TSV file)
> # "lines[0..5]" restricts raku to reading first 6 lines!
> # change "lines[0..5]" to "lines" to run raku code on whole file:
> con_obj2 <- pipe(paste("raku -e '.put for lines[0..5].grep( 
> *.starts-with(q[A00145]), :p);' ", "genbankIDs_3GB.tsv"), "rt");
> p6_import_b <- read.table(con_obj2, sep="\t");
> close(con_obj2)
> p6_import_b
  V1     V2       V3          V4 V5
1  4 A00145 A00145.1 IFN-alpha A NA
>
> # unicode test using R's system() function:
> try(system("raku -ne '.grep( /  你好  |  こんにちは  |  مرحبا  |  Привет  /, :v 
> ).put;'  hello_7lang.txt", intern = TRUE, ignore.stderr = FALSE))
[1] """"""
"你好 Chinese"
[5] "こんにちは Japanese" "مرحبا Arabic""Привет Russian"
>

[special thanks to Brad Gilbert, Joseph Brenner and others on the
perl6-users mailing list. All errors above are my own.]

HTH, Bill.

W. Michels, Ph.D.




On Fri, May 22, 2020 at 4:48 AM Laurent Rhelp  wrote:
>
> Hi Ivan,
>Endeed, it is a good idea. I am under MSwindows but I can use the
> bash command I use with git. I will see how to do that with the unix
> command lines.
>
>
> Le 20/05/2020 à 09:46, Ivan Krylov a écrit :
> > Hi Laurent,
> >
> > I am not saying this will work every time and I do recognise that this
> > is very different from a more general solution that you had envisioned,
> > but if you are on an UNIX-like system or have the relevant utilities
> > installed and on the %PATH% on Windows, you can filter the input file
> > line-by-line using a pipe and an external program:
> >
> > On Sun, 17 May 2020 15:52:30 +0200
> > Laurent Rhelp  wrote:
> >
> >> # sensors to keep
> >> sensors <-  c("N053", "N163")
> > # filter on the beginning of the line
> > i <- pipe("grep -E '^(N053|N163)' test.txt")
> > # or:
> > # filter on the beginning of the given column
> > # (use $2 for the second column, etc.)
> > i <- pipe("awk '($1 ~ \"^(N053|N163)\")' test.txt")
> > # or:
> > # since your message is full of Unicode non-breaking spaces, I have to
> > # bring in heavier machinery to handle those correctly;
> > # only this solution manages to match full column values
> > # (here you can also use $F[1] for second column and so on)
> > i <- pipe("perl -CSD -F'\\s+' -lE \\
> >   'print join qq{\\t}, @F if $F[0] =~ /^(N053|N163)$/' \\
> >   test.txt
> > ")
> > lines <- read.table(i) # closes i when done
> >
> > The downside of this approach is having to shell-escape the command
> > lines, which can become complicated, and choosing between use of regular
> > expressions and more wordy programs (Unicode whitespace in the input
> > doesn't help, either).
> >
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iterators : checkFunc with ireadLines

2020-05-24 Thread William Michels via R-help
Strike that one sentence in brackets: "[In point of fact, the R Data
Import/Export Manual suggests using perl]", to pre-process data before
loading into R. The manual's recommendation only pertains to large
fixed width formatted files [see #1], whereas Laurent's data is
whitespace-delimited:

> read.table( "Laurents.txt")
> read.delim( "Laurents.txt", sep="")

Best Regards, Bill.

W. Michels, Ph.D.

Citation:
[#1] 
https://cran.r-project.org/doc/manuals/r-release/R-data.html#Fixed_002dwidth_002dformat-files

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iterators : checkFunc with ireadLines

2020-05-27 Thread William Michels via R-help
Hi Laurent,

Off the bat I would have guessed that the problem you're seeing has to
do with 'command line quoting' differences between the Windows system
and the Linux/Mac systems. I've noticed people using Windows having
better command line success with "exterior double-quotes / interior
single-quotes" while Linux/Mac tend to have more success with
"exterior single- quotes / interior double-quotes". The problem is
exacerbated in R by system() or pipe() calls which require another
(exterior) set of quotations.

1. You can print out your connection object to make sure that the
interior code was read properly into R. Also, take a look at the
'connections' help page to see if there are other parameters you need
to explicitly set (like encoding). Here's the first (working) example
from my last post to you:

> ?connections
> con_obj1
  description  "raku -e '.put for lines.grep( / ^^N053 | ^^N163 /, :p );'  Laurents.txt"
  class        "pipe"
  mode         "rt"
  text         "text"
  opened       "opened"
  can read     "yes"
  can write    "no"
>

2. You can try 'backslash-escaping' interior quotes in your system()
or pipe() calls. Also, in two of my previous examples I use paste() to
break up complicated quoting into more manageable chunks. You can try
these calls with 'backslash-escaped' interior quotes, and without
paste():

> con_obj1 <- pipe("raku -e \'.put for lines.grep( / ^^N053 | ^^N163 /, :p );\' 
> Laurents.txt", open="rt");
> con_obj1
  description  "raku -e '.put for lines.grep( / ^^N053 | ^^N163 /, :p );' Laurents.txt"
  class        "pipe"
  mode         "rt"
  text         "text"
  opened       "opened"
  can read     "yes"
  can write    "no"
>

3. If R creates your 'con_obj' without throwing an error, then you
should try the most basic functions for reading data into R, something
like readLines(). Again, recreate our 'con_obj' with different
encodings, if necessary. Be careful of reading from the same
connection object with multiple R functions (an unlikely scenario, but
one that should be mentioned). Below it appears that 'con_obj1' gets
consumed by readLines() before the second call to scan():

> rm(con_obj1)
> # note: dropped ':p' adverb below to simplify
> con_obj1 <- pipe("raku -e \'.put for lines.grep( / ^^N053 | ^^N163 / );\' 
> Laurents.txt", open="rt");
> scan(con_obj1)
Error in scan(con_obj1) : scan() expected 'a real', got 'N053'
> con_obj1 <- pipe("raku -e \'.put for lines.grep( / ^^N053 | ^^N163 / );\' 
> Laurents.txt", open="rt");
> readLines(con_obj1)
[1] "N053-0.014083-0.0047410.001443-0.010152 -0.012996
   -0.005337-0.008738-0.015094-0.012104"
[2] "N163-0.054023-0.049345-0.037158-0.04112 -0.044612
   -0.036953-0.036061-0.044516-0.046436"
> scan(con_obj1)
Read 0 items
numeric(0)

>

Other than that, you can post here again and we'll try to help. If you
become convinced it's a raku problem, you can check the 'raku-grep'
help page at https://docs.raku.org/routine/grep, or post a question to
the perl6-users mailing list at perl6-us...@perl.org .

HTH, Bill.

W. Michels, Ph.D.
On Wed, May 27, 2020 at 1:56 AM Laurent Rhelp  wrote:
>
> I installed raku on my PC to test your solution:
>
> The command raku -e '.put for lines.grep( / ^^N053 | ^^N163 /, :p );'
> Laurents.txt works fine when I write it in the bash command but when I
> use 

Re: [R] how to filter variables which appear in any row but do not include

2020-06-03 Thread William Michels via R-help
#Below returns long list of TRUE/FALSE values,
#Note: "IDs" is a column name,
#Wrap with head() to shorten:
df$IDs %in% c("ident_1", "ident_2");

#Below returns index of IDs that are TRUE,
#Wrap with head() to shorten:
which(df$IDs %in% c("ident_1", "ident_2"));

#Below returns short TRUE/FALSE table:
table(df$IDs %in% c("ident_1", "ident_2"));

#Below check df to see unique IDs returned by "%in%" code above,
#(Good for identifying missing "desired" IDs):
unique(df[df$IDs %in% c("ident_1", "ident_2"), "IDs"]);

#Below returns dimensions of dataframe "filtered" (retained) by desired IDs,
#(Note rows below should equal number of TRUE in table above):
dim(df[df$IDs %in% c("ident_1", "ident_2"), ]);

#Create filtered dataframe object:
df_filtered  <-  df[df$IDs %in% c("ident_1", "ident_2"),  ];

#Below returns row counts per "IDs" ("ident_1", "ident_2", etc.) in df_filtered:
aggregate(df_filtered$IDs, by=list(df_filtered$IDs), FUN = "length");
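
A possible follow-up, sketched with the same toy objects as above
("df_filtered", "IDs") and hypothetical exclusion values standing in for
codes like E102/E112:

#Below drops any rows whose IDs fall in an exclusion set:
exclude_ids <- c("ident_3", "ident_4");
df_filtered2 <- df_filtered[!(df_filtered$IDs %in% exclude_ids), ];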


HTH, Bill.

W. Michels, Ph.D.





On Wed, Jun 3, 2020 at 7:56 AM Ana Marija  wrote:
>
> Hello.
>
> I am trying to filter only rows that have ANY of these variables:
> E109, E119, E149
>
> so I did:
> controls=t %>% filter_all(any_vars(. %in% c("E109", "E119","E149")))
>
> than I checked what I got:
> > s0 <- sapply(controls, function(x) grep('^E10', x, value = TRUE))
> > d0=unlist(s0)
> > d10=unique(d0)
> > d10
>  [1] "E10"  "E103" "E104" "E109" "E101" "E108" "E105" "E100" "E106" "E102"
> [11] "E107"
> s1 <- sapply(controls, function(x) grep('^E11', x, value = TRUE))
> d1=unlist(s1)
> d11=unique(d1)
> > d11
>  [1] "E11"  "E119" "E113" "E115" "E111" "E114" "E110" "E118" "E116" "E112"
> [11] "E117"
>
> I need help with changing this command
> controls=t %>% filter_all(any_vars(. %in% c("E109", "E119","E149")))
>
> so that in the output I do not have any rows that include E102 or E112?
>
> Thanks
> Ana
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R Compiled for Mac OS X

2020-06-25 Thread William Michels via R-help
Hi, you can try starting at the link below:

https://stat.ethz.ch/R-manual/R-patched/doc/html/packages.html

Or type any of following commands into your R-Console (for starters):

> library()
> library(help="base")
> library(help="stats")
> library(help="graphics")
> library(help="grDevices")
> library(help="utils")
> library(help="datasets")
> library(help="methods")
>
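
If the immediate goal is movavg() and ggplot() specifically, those live in
contributed CRAN packages rather than in base R (pracma and ggplot2,
respectively, if I remember correctly); a quick sketch:

install.packages(c("pracma", "ggplot2"))   # contributed packages from CRAN
library(pracma)                            # provides movavg()
library(ggplot2)                           # provides ggplot()
movavg(1:10, n = 3, type = "s")            # simple moving average of a toy vector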

HTH, Bill.

W. Michels, Ph.D.

On Thu, Jun 25, 2020 at 4:09 PM Gregory Coats via R-help
 wrote:
>
> Today I downloaded and installed the June 6, 2020 version of R from the 
> official CRAN site at Carnegie Mellon University. Unfortunately, while the 
> CMU-compiled Mac OS X R application provides access to base R stat functions 
> like mean, it does not provide me with access to any of R’s more advanced 
> functions like movavg and ggplot.
> From where can I download a more complete R executable, compiled for Mac OS X?
> Greg Coats
>
> http://lib.stat.cmu.edu/R/CRAN/ 
> Download R for (Mac) OS X
> R-4.0.2.pkg (notarized and signed)
> > version
> platform   x86_64-apple-darwin17.0
> arch   x86_64
> os darwin17.0
> system x86_64, darwin17.0
> status
> major  4
> minor  0.1
> year   2020
> month  06
> day06
> svn rev78648
> language   R
> version.string R version 4.0.1 (2020-06-06)
> nickname   See Things Now
> > mean
> function (x, ...)
> UseMethod("mean")
> 
> 
> > movavg
> Error: object 'movavg' not found
> > ggplot
> Error: object 'ggplot' not found
> >
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Character (1a, 1b) to numeric

2020-07-10 Thread William Michels via R-help
Hello Jean-Louis,

Noting the subject line of your post I thought the first answer would
have been encoding histology stages as factors, and "unclass-ing" them
to obtain integers that then can be mathematically manipulated. You
can get a lot of work done with all the commands listed on the
"factor" help page:

?factor
samples <- 1:36
values <- runif(length(samples), min=1, max=length(samples))
hist <- rep(c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c"), times=1:8)
data1 <- data.frame("samples" = samples, "values" = values, "hist" = hist )
(data1$hist <- factor(data1$hist, levels=c("1", "1a", "1b", "1c", "2",
"2a", "2b", "2c")) )
unclass(data1$hist)

library(RColorBrewer); pal_1 <- brewer.pal(8, "Pastel2")
barplot(data1$value, beside=T, col=pal_1[data1$hist])
plot(data1$hist, data1$value, col=pal_1)
pal_2 <- brewer.pal(8, "Dark2")
plot(unclass(data1$hist)/4, data1$value, pch=19, col=pal_2[data1$hist] )
group <- c(rep(0,10),rep(1,26)); data1$group <- group
library(lattice); dotplot(hist ~ values | group, data=data1, xlim=c(0,36) )

HTH, Bill.

W. Michels, Ph.D.




On Fri, Jul 10, 2020 at 1:41 PM Jean-Louis Abitbol  wrote:
>
> Many thanks to all. This help-list is wonderful.
>
> I have used Rich Heiberger solution using match and found something to learn 
> in each answer.
>
> off topic, I also enjoyed very much his 2008 paper on the graphical 
> presentation of safety data
>
> Best wishes.
>
>
> On Fri, Jul 10, 2020, at 10:02 PM, Fox, John wrote:
> > Hi,
> >
> > We've had several solutions, and I was curious about their relative
> > efficiency. Here's a test with a moderately large data vector:
> >
> > > library("microbenchmark")
> > > set.seed(123) # for reproducibility
> > > x <- sample(xc, 1e4, replace=TRUE) # "data"
> > > microbenchmark(John = John <- xn[x],
> > +Rich = Rich <- xn[match(x, xc)],
> > +Jeff = Jeff <- {
> > + n <- as.integer( sub( "[a-i]$", "", x ) )
> > + d <- match( sub( "^\\d+", "", x ), letters[1:9] )
> > + d[ is.na( d ) ] <- 0
> > + n + d / 10
> > + },
> > +David = David <- as.numeric(gsub("a", ".3",
> > +  gsub("b", ".5",
> > +   gsub("c", ".7", x)))),
> > +times=1000L
> > +)
> > Unit: microseconds
> >   expr       min        lq       mean     median         uq       max neval cld
> >   John   228.816   345.371   513.5614   503.5965   533.0635  10829.08  1000 a
> >   Rich   217.395   343.035   534.2074   489.0075   518.3260  15388.96  1000 a
> >   Jeff 10325.471 13070.737 15387.2545 15397.9790 17204.0115 153486.94  1000  b
> >  David 14256.673 18148.492 20185.7156 20170.3635 22067.6690  34998.95  1000   c
> > > all.equal(John, Rich)
> > [1] TRUE
> > > all.equal(John, David)
> > [1] "names for target but not for current"
> > > all.equal(John, Jeff)
> > [1] "names for target but not for current" "Mean relative difference:
> > 0.1498243"
> >
> > Of course, efficiency isn't the only consideration, and aesthetically
> > (and no doubt subjectively) I prefer Rich Heiberger's solution. OTOH,
> > Jeff's solution is more general in that it generates the correspondence
> > between letters and numbers. The argument for Jeff's solution would,
> > however, be stronger if it gave the desired answer.
> >
> > Best,
> >  John
> >
> > > On Jul 10, 2020, at 3:28 PM, David Carlson  wrote:
> > >
> > > Here is a different approach:
> > >
> > > xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > > xn <- as.numeric(gsub("a", ".3", gsub("b", ".5", gsub("c", ".7", xc))))
> > > xn
> > > # [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
> > >
> > > David L Carlson
> > > Professor Emeritus of Anthropology
> > > Texas A&M University
> > >
> > > On Fri, Jul 10, 2020 at 1:10 PM Fox, John  wrote:
> > > Dear Jean-Louis,
> > >
> > > There must be many ways to do this. Here's one simple way (with no claim 
> > > of optimality!):
> > >
> > > > xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > > > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> > > >
> > > > set.seed(123) # for reproducibility
> > > > x <- sample(xc, 20, replace=TRUE) # "data"
> > > >
> > > > names(xn) <- xc
> > > > z <- xn[x]
> > > >
> > > > data.frame(z, x)
> > >  z  x
> > > 1  2.5 2b
> > > 2  2.5 2b
> > > 3  1.5 1b
> > > 4  2.3 2a
> > > 5  1.5 1b
> > > 6  1.3 1a
> > > 7  1.3 1a
> > > 8  2.3 2a
> > > 9  1.5 1b
> > > 10 2.0  2
> > > 11 1.7 1c
> > > 12 2.3 2a
> > > 13 2.3 2a
> > > 14 1.0  1
> > > 15 1.3 1a
> > > 16 1.5 1b
> > > 17 2.7 2c
> > > 18 2.0  2
> > > 19 1.5 1b
> > > 20 1.5 1b
> > >
> > > I hope this helps,
> > >  John
> > >
> > >   -
> > >   John Fox, Professor Emeritus
> > >   McMaster University
> > >   Hamilton, Ontario, Canada
> > >   Web: http::/socserv.mcmaster.ca/jfox
> > >
> > > > On Jul 10, 2020, at 1:50 PM, Jean-Louis Abitbol  
> > > > wrote:

Re: [R] Character (1a, 1b) to numeric

2020-07-11 Thread William Michels via R-help
Agreed, I meant to add this line (for unclassed factor levels 1-through-8):

> ((1:8 - 1)*(0.25))+1
[1] 1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75

Depending on the circumstance, you can also consider using dummy
factors or even "NA" as a level; see the "factor" help page for
details.
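
Putting the two steps together on the toy 'data1' object from earlier in this
thread (sketch only):

# map the level codes 1..8 onto 1.00, 1.25, ..., 2.75
data1$hist_num <- (as.integer(data1$hist) - 1) * 0.25 + 1   # same codes as unclass()
head(data1[, c("hist", "hist_num")])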

Best, Bill.

W. Michels, Ph.D.



On Sat, Jul 11, 2020 at 12:16 AM Jean-Louis Abitbol  wrote:
>
> Hello Bill,
>
> Thanks.
>
> That has indeed the advantage of keeping the histology classification on the  
> plot instead of some arbitrary numeric scale.
>
> Best wishes, JL
>
> On Sat, Jul 11, 2020, at 8:25 AM, William Michels wrote:
> > Hello Jean-Louis,
> >
> > Noting the subject line of your post I thought the first answer would
> > have been encoding histology stages as factors, and "unclass-ing" them
> > to obtain integers that then can be mathematically manipulated. You
> > can get a lot of work done with all the commands listed on the
> > "factor" help page:
> >
> > ?factor
> > samples <- 1:36
> > values <- runif(length(samples), min=1, max=length(samples))
> > hist <- rep(c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c"), times=1:8)
> > data1 <- data.frame("samples" = samples, "values" = values, "hist" = hist )
> > (data1$hist <- factor(data1$hist, levels=c("1", "1a", "1b", "1c", "2",
> > "2a", "2b", "2c")) )
> > unclass(data1$hist)
> >
> > library(RColorBrewer); pal_1 <- brewer.pal(8, "Pastel2")
> > barplot(data1$value, beside=T, col=pal_1[data1$hist])
> > plot(data1$hist, data1$value, col=pal_1)
> > pal_2 <- brewer.pal(8, "Dark2")
> > plot(unclass(data1$hist)/4, data1$value, pch=19, col=pal_2[data1$hist] )
> > group <- c(rep(0,10),rep(1,26)); data1$group <- group
> > library(lattice); dotplot(hist ~ values | group, data=data1, xlim=c(0,36) )
> >
> > HTH, Bill.
> >
> > W. Michels, Ph.D.
> >
> >
> >
> >
> > On Fri, Jul 10, 2020 at 1:41 PM Jean-Louis Abitbol  wrote:
> > >
> > > Many thanks to all. This help-list is wonderful.
> > >
> > > I have used Rich Heiberger solution using match and found something to 
> > > learn in each answer.
> > >
> > > off topic, I also enjoyed very much his 2008 paper on the graphical 
> > > presentation of safety data
> > >
> > > Best wishes.
> > >
> > >
> > > On Fri, Jul 10, 2020, at 10:02 PM, Fox, John wrote:
> > > > Hi,
> > > >
> > > > We've had several solutions, and I was curious about their relative
> > > > efficiency. Here's a test with a moderately large data vector:
> > > >
> > > > > library("microbenchmark")
> > > > > set.seed(123) # for reproducibility
> > > > > x <- sample(xc, 1e4, replace=TRUE) # "data"
> > > > > microbenchmark(John = John <- xn[x],
> > > > +Rich = Rich <- xn[match(x, xc)],
> > > > +Jeff = Jeff <- {
> > > > + n <- as.integer( sub( "[a-i]$", "", x ) )
> > > > + d <- match( sub( "^\\d+", "", x ), letters[1:9] )
> > > > + d[ is.na( d ) ] <- 0
> > > > + n + d / 10
> > > > + },
> > > > +David = David <- as.numeric(gsub("a", ".3",
> > > > +  gsub("b", ".5",
> > > > +   gsub("c", ".7", x)))),
> > > > +times=1000L
> > > > +)
> > > > Unit: microseconds
> > > >   expr       min        lq       mean     median         uq       max neval cld
> > > >   John   228.816   345.371   513.5614   503.5965   533.0635  10829.08  1000 a
> > > >   Rich   217.395   343.035   534.2074   489.0075   518.3260  15388.96  1000 a
> > > >   Jeff 10325.471 13070.737 15387.2545 15397.9790 17204.0115 153486.94  1000  b
> > > >  David 14256.673 18148.492 20185.7156 20170.3635 22067.6690  34998.95  1000   c
> > > > > all.equal(John, Rich)
> > > > [1] TRUE
> > > > > all.equal(John, David)
> > > > [1] "names for target but not for current"
> > > > > all.equal(John, Jeff)
> > > > [1] "names for target but not for current" "Mean relative difference:
> > > > 0.1498243"
> > > >
> > > > Of course, efficiency isn't the only consideration, and aesthetically
> > > > (and no doubt subjectively) I prefer Rich Heiberger's solution. OTOH,
> > > > Jeff's solution is more general in that it generates the correspondence
> > > > between letters and numbers. The argument for Jeff's solution would,
> > > > however, be stronger if it gave the desired answer.
> > > >
> > > > Best,
> > > >  John
> > > >
> > > > > On Jul 10, 2020, at 3:28 PM, David Carlson  wrote:
> > > > >
> > > > > Here is a different approach:
> > > > >
> > > > > xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > > > > xn <- as.numeric(gsub("a", ".3", gsub("b", ".5", gsub("c", ".7", xc))))
> > > > > xn
> > > > > # [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
> > > > >
> > > > > David L Carlson
> > > > > Professor Emeritus of Anthropology
> > > > > Texas A&M University
> > > > >
> > > > > On Fri, Jul 10, 2020 at 1:10 PM Fox, John  wrote:
> > > > > Dear Jean-Louis,
> >

Re: [R] Help with read.csv.sql()

2020-07-18 Thread William Michels via R-help
Do either of the postings/threads below help?

https://r.789695.n4.nabble.com/read-csv-sql-to-select-from-a-large-csv-file-td4650565.html#a4651534
https://r.789695.n4.nabble.com/using-sqldf-s-read-csv-sql-to-read-a-file-with-quot-NA-quot-for-missing-td4642327.html

Otherwise you can try reading through the FAQ on Github:

https://github.com/ggrothendieck/sqldf
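
Failing that, the wrap-and-rename route suggested elsewhere in this thread is
straightforward; a rough sketch (the column names, the SQL and the date
format below are assumptions about your files, not tested against them):

library(sqldf)
read_my_csv <- function(f) {
  df <- read.csv.sql(f, sql = "select * from file", header = TRUE, sep = ",")
  names(df) <- c("id", "date", "value")               # your preferred names
  df$date <- as.Date(df$date, format = "%m/%d/%Y")    # convert after reading
  df
}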

HTH, Bill.

W. Michels, Ph.D.



On Sat, Jul 18, 2020 at 9:59 AM H  wrote:
>
> On 07/18/2020 11:54 AM, Rui Barradas wrote:
> > Hello,
> >
> > I don't believe that what you are asking for is possible but like Bert 
> > suggested, you can do it after reading in the data.
> > You could write a convenience function to read the data, then change what 
> > you need to change.
> > Then the function would return this final object.
> >
> > Rui Barradas
> >
> > Às 16:43 de 18/07/2020, H escreveu:
> >
> >> On 07/17/2020 09:49 PM, Bert Gunter wrote:
> >>> Is there some reason that you can't make the changes to the data frame 
> >>> (column names, as.date(), ...) *after* you have read all your data in?
> >>>
> >>> Do all your csv files use the same names and date formats?
> >>>
> >>>
> >>> Bert Gunter
> >>>
> >>> "The trouble with having an open mind is that people keep coming along 
> >>> and sticking things into it."
> >>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 6:28 PM H  >>> > wrote:
> >>>
> >>>  I have created a dataframe with columns that are characters, 
> >>> integers and numeric and with column names assigned by me. I am using 
> >>> read.csv.sql() to read portions of a number of large csv files into this 
> >>> dataframe, each csv file having a header row with columb names.
> >>>
> >>>  The problem I am having is that the csv files have header rows with 
> >>> column names that are slightly different from the column names I have 
> >>> assigned in the dataframe and it seems that when I read the csv data into 
> >>> the dataframe, the column names from the csv file replace the column 
> >>> names I chose when creating the dataframe.
> >>>
> >>>  I have been unable to figure out if it is possible to assign column 
> >>> names of my choosing in the read.csv.sql() function? I have tried various 
> >>> variations but none seem to work. I tried colClasses = c() but that 
> >>> did not work, I tried field.types = c(...) but could not get that to work 
> >>> either.
> >>>
> >>>  It seems that the above should be feasible but I am missing 
> >>> something? Does anyone know?
> >>>
> >>>  A secondary issue is that the csv files have a column with a date in 
> >>> mm/dd/ format that I would like to make into a Date type column in my 
> >>> dataframe. Again, I have been unable to find a way - if at all possible - 
> >>> to force a conversion into a Date format when importing into the 
> >>> dataframe. The best I have so far is to import is a character column and 
> >>> then use as.Date() to later force the conversion of the dataframe column.
> >>>
> >>>  Is it possible to do this when importing using read.csv.sql()?
> >>>
> >>>  __
> >>>  R-help@r-project.org  mailing list -- 
> >>> To UNSUBSCRIBE and more, see
> >>>  https://stat.ethz.ch/mailman/listinfo/r-help
> >>>  PLEASE do read the posting guide 
> >>> http://www.R-project.org/posting-guide.html
> >>>  and provide commented, minimal, self-contained, reproducible code.
> >>>
> >> Yes, the files use the same column names and date format (at least as far 
> >> as I know now.) I agree I could do it as you suggest above but from a 
> >> purist perspective I would rather do it when importing the data using 
> >> read.csv.sql(), particularly if column names and/or date format might 
> >> change, or be different between different files. I am indeed selecting 
> >> rows from a large number of csv files so this is entirely plausible.
> >>
> >> Has anyone been able to name columns in the read.csv.sql() call and/or 
> >> force date format conversion in the call itself? The first refers to 
> >> naming columns differently from what a header in the csv file may have.
> >>
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> The documentation for read.csv.sql() suggests that colClasses() and/or 
> field.types() should work but I may well have misunderstood the 
> documentation, hence my question in this group.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] help with web scraping

2020-07-23 Thread William Michels via R-help
Hi Spencer,

I tried the code below on an older R-installation, and it works fine.
Not a full solution, but it's a start:

> library(RCurl)
Loading required package: bitops
> url <- 
> "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975";
> M_sos <- getURL(url)
> print(M_sos)
[1] "\r\n\r\n\r\n\r\n\r\n\tSOS, Missouri - Elections:
Offices Filed in Candidate Filing\r\n wrote:
>
> Hello, All:
>
>
>I've failed with multiple attempts to scrape the table of
> candidates from the website of the Missouri Secretary of State:
>
>
> https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975
>
>
>I've tried base::url, base::readLines, xml2::read_html, and
> XML::readHTMLTable; see summary below.
>
>
>Suggestions?
>Thanks,
>Spencer Graves
>
>
> sosURL <-
> "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975";
>
> str(baseURL <- base::url(sosURL))
> # this might give me something, but I don't know what
>
> sosRead <- base::readLines(sosURL) # 404 Not Found
> sosRb <- base::readLines(baseURL) # 404 Not Found
>
> sosXml2 <- xml2::read_html(sosURL) # HTTP error 404.
>
> sosXML <- XML::readHTMLTable(sosURL)
> # List of 0;  does not seem to be XML
>
> sessionInfo()
>
> R version 4.0.2 (2020-06-22)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Catalina 10.15.5
>
> Matrix products: default
> BLAS:
> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> LAPACK:
> /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets
> [6] methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_4.0.2 tools_4.0.2curl_4.3
> [4] xml2_1.3.2 XML_3.99-0.3
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [External] Re: help with web scraping

2020-07-25 Thread William Michels via R-help
Dear Spencer Graves (and Rasmus Liland),

I've had some luck just using gsub() to alter the offending "<br/>"
characters, appending a "___" tag at each instance of "<br/>" (first I
checked the text to make sure it didn't contain any pre-existing
instances of "___"). See the output snippet below:

> library(RCurl)
> library(XML)
> sosURL <- 
> "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975";
> sosChars <- getURL(sosURL)
> sosChars2 <- gsub("<br/>", "___", sosChars)
> MOcan <- readHTMLTable(sosChars2)
> MOcan[[2]]
                  Name                          Mailing Address Random Number Date Filed
1       Raleigh Ritter      4476 FIVE MILE RD___SENECA MO 64865           185  2/25/2020
2          Mike Parson         1458 E 464 RD___BOLIVAR MO 65613           348  2/25/2020
3 James W. (Jim) Neely            PO BOX 343___CAMERON MO 64429           477  2/25/2020
4     Saundra McDowell 3854 SOUTH AVENUE___SPRINGFIELD MO 65807                3/31/2020
>

It's true, there's one 'section' of MOcan output that contains
odd-looking characters (see the "Total" line of MOcan[[1]]). But my
guess is you'll be deleting this 'line' anyway--and recalculating totals in R.
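
For example, a minimal sketch of that cleanup (the column tested and the
"Total" label are guesses about MOcan[[1]], so adjust as needed):

summ <- MOcan[[1]]
summ <- summ[!grepl("Total", summ[[1]]), ]   # drop any row whose first field mentions "Total"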

Now that you have a comprehensive list object, you should be able to
pull out districts/races of interest. You might want to take a look at
the "rlist" package, to see if it can make your work a little easier:

https://CRAN.R-project.org/package=rlist
https://renkun-ken.github.io/rlist-tutorial/index.html

HTH, Bill.

W. Michels, Ph.D.









On Sat, Jul 25, 2020 at 7:56 AM Spencer Graves
 wrote:
>
> Dear Rasmus et al.:
>
>
> On 2020-07-25 04:10, Rasmus Liland wrote:
> > On 2020-07-24 10:28 -0500, Spencer Graves wrote:
> >> Dear Rasmus:
> >>
> >>> Dear Spencer,
> >>>
> >>> I unified the party tables after the
> >>> first summary table like this:
> >>>
> >>> url <- 
> >>> "https://s1.sos.mo.gov/CandidatesOnWeb/DisplayCandidatesPlacement.aspx?ElectionCode=750004975";
> >>> M_sos <- RCurl::getURL(url)
> >>> saveRDS(object=M_sos, file="dcp.rds")
> >>> dat <- XML::readHTMLTable(M_sos)
> >>> idx <- 2:length(dat)
> >>> cn <- unique(unlist(lapply(dat[idx], colnames)))
> >> This is useful for this application.
> >>
> >>> dat <- do.call(rbind,
> >>>   sapply(idx, function(i, dat, cn) {
> >>> x <- dat[[i]]
> >>> x[,cn[!(cn %in% colnames(x))]] <- NA
> >>> x <- x[,cn]
> >>> x$Party <- names(dat)[i]
> >>> return(list(x))
> >>>   }, dat=dat, cn=cn))
> >>> dat[,"Date Filed"] <-
> >>>   as.Date(x=dat[,"Date Filed"],
> >>>   format="%m/%d/%Y")
> >> This misses something extremely
> >> important for this application:  The
> >> political office. That's buried in
> >> the HTML or whatever it is. I'm using
> >> something like the following to find
> >> that:
> >>
> >> str(LtGov <- gregexpr('Lieutenant Governor', M_sos)[[1]])
> > Dear Spencer,
> >
> > I came up with a solution, but it is not
> > very elegant.  Instead of showing you
> > the solution, hoping you understand
> > everything in it, I istead want to give
> > you some emphatic hints to see if you
> > can come up with a solution on you own.
> >
> > - XML::htmlTreeParse(M_sos)
> >- *Gandalf voice*: climb the tree
> >  until you find the content you are
> >  looking for flat out at the level of
> >  «The Children of the Div», *uuuUUU*
> >- you only want to keep the table and
> >  header tags at this level
> > - Use XML::xmlValue to extract the
> >values of all the headers (the
> >political positions)
> > - Observe that all the tables on the
> >page you were able to extract
> >previously using XML::readHTMLTable,
> >are at this level, shuffled between
> >the political position header tags,
> >this means you extract the political
> >position and party affiliation by
> >using a for loop, if statements,
> >typeof, names, and [] and [[]] to grab
> >different things from the list
> >(content or the bag itself).
> >XML::readHTMLTable strips away the
> >line break tags from the Mailing
> >address, so if you find a better way
> >of extracting the tables, tell me,
> >e.g. you get
> >
> >   8805 HUNTER AVEKANSAS CITY MO 64138
> >
> >and not
> >
> >   8805 HUNTER AVEKANSAS CITY MO 64138
> >
> > When you've completed this «programming
> > quest», you're back at the level of the
> > previous email, i.e.  you have have the
> > same tables, but with political position
> > and party affiliation added to them.
>
>
>Please excuse:  Before my last post, I had written code to do all
> that.  In brief, the political offices are "h3" tags.  I used "strsplit"
> to split the string at "".  I then wrote a function to find "",
> extract the political office and pass the rest to "XML::readHTMLTable",
> adding columns for party and political office.
>
>
>However, this suppressed "" everywhere.  I thought there
> should be an

Re: [R] PROBLEM: quickly downloading 10,000 articles to sift through

2020-08-30 Thread William Michels via R-help
Hello John, Does this help?

https://cran.r-project.org/web/packages/bibliometrix/vignettes/bibliometrix-vignette.html
https://bibliometrix.org/
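
In case it helps to see the basic workflow, here is a minimal sketch with
bibliometrix (the file name and database source are placeholders, not
something from your setup):

# install.packages("bibliometrix")                 # once, from CRAN
library(bibliometrix)

# 'savedrecs.bib' is a hypothetical Web of Science/Scopus export
M <- convert2df(file = "savedrecs.bib", dbsource = "wos", format = "bibtex")

results <- biblioAnalysis(M, sep = ";")            # authors, sources, keywords, ...
summary(results, k = 10)                           # top-10 summary tables
plot(results, k = 10)                              # basic descriptive plots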

Best, Bill.

W. Michels, Ph.D.


On Fri, Aug 28, 2020 at 11:04 PM Fraedrich, John  wrote:
>
>
>
> To analyze 10,000+ articles within several journals to determine major 
> theories used, empirical research of models, constructs, and variables, 
> differences in standard definitions by discipline, etc. Does R have a 
> software package for this?
>
>
>
> Sent from Mail for Windows 10
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.