Re: [R] legend order in ggplot2

2018-05-27 Thread Tom Hopper
John,

The order of legends in ggplot2 depends on the order of factor levels in the 
data frame. The linetype can be matched to the factor levels using a named 
vector (ggplot2 basically does a lookup).

The biggest problem you have here is that you’re not passing data in the right 
form or format to ggplot2. ggplot2 expects a data frame in tidy format (see 
http://r4ds.had.co.nz/tidy-data.html), and you’ve got wide (and pretty messy) 
data.

To get this to work, your data frame needs to look like:

x   var      val
1   name_a   1
2   name_a   2
1   name_b   3
2   name_b   4
.
.
.

To make this work, I’m going to load tidyverse instead of just ggplot2. We’ll 
start with your data, but we need to reshape it into tidy format using dplyr 
and tidyr, and then we need to order the factor levels with forcats.

library(tidyverse)
my_df <- tibble(x = c(1, 2),
                name_a = c(1, 2),
                name_b = c(3, 4),
                name_c = c(5, 6)) %>%
  gather(var, val, -x) %>%
  mutate(var = fct_relevel(var, "name_b", "name_a", "name_c"))

Here’s how the data looks:

> my_df
# A tibble: 6 x 3
      x var      val
  <dbl> <fct>  <dbl>
1     1 name_a     1
2     2 name_a     2
3     1 name_b     3
4     2 name_b     4
5     1 name_c     5
6     2 name_c     6
> levels(my_df$var)
[1] "name_b" "name_a" "name_c"

Now we have the data in the right format, and your factors are in the right 
order for the legend. We can quickly test this:

ggplot(my_df, aes(x = x, y = val, linetype = var)) +
  geom_line()

and I think this is pretty close to what you were looking for.

To control the linetype by factor level, we need to tell ggplot2 which linetype 
goes with which factor level by using scale_linetype_manual():

my_lines <- c(name_a = "solid", name_b = "dotted", name_c = "twodash")

ggplot(my_df, aes(x = x, y = val, linetype = var)) +
  geom_line() +
  scale_linetype_manual(values = my_lines)

Here, I’ve deliberately changed the linetypes so we can see that the code works 
as desired. You’ll want to change the linetypes in my_lines to what you want.
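
The lookup that the named vector gives you can be sketched in base R. This is only an analogy for how names are matched to factor levels, not ggplot2's internal code, and the variable lv is a name I made up for illustration:

```r
# Legend order comes from the factor levels
lv <- c("name_b", "name_a", "name_c")

# Named vector: names are factor levels, values are linetypes
my_lines <- c(name_a = "solid", name_b = "dotted", name_c = "twodash")

# Indexing by name returns the linetypes in legend order,
# regardless of the order in which my_lines was written
unname(my_lines[lv])
```

So the order of the entries in my_lines does not matter; only the names do.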

Putting it all together:

library(tidyverse)
my_df <- tibble(x = c(1, 2),
                name_a = c(1, 2),
                name_b = c(3, 4),
                name_c = c(5, 6)) %>%
  gather(var, val, -x) %>%
  mutate(var = fct_relevel(var, "name_b", "name_a", "name_c"))

my_lines <- c(name_a = "solid", name_b = "dotted", name_c = "twodash")

ggplot(my_df, aes(x = x, y = val, linetype = var)) +
  geom_line() +
  scale_linetype_manual(values = my_lines)
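
If you have a newer tidyr (1.0.0 or later, which is an assumption about your installation), the same reshaping can probably be sketched with pivot_longer() and base factor(). Here I have also set the linetypes to the ones you asked for (name_b solid, name_a dashed, name_c dotted); wide_df and long_df are names I made up for this sketch:

```r
library(tidyr)
library(ggplot2)

# Same toy data as above, built with base data.frame()
wide_df <- data.frame(x = c(1, 2),
                      name_a = c(1, 2),
                      name_b = c(3, 4),
                      name_c = c(5, 6))

# pivot_longer() needs tidyr >= 1.0.0; it plays the role of gather() here
long_df <- pivot_longer(wide_df, cols = -x,
                        names_to = "var", values_to = "val")

# Base factor() sets the level order, which drives the legend order
long_df$var <- factor(long_df$var, levels = c("name_b", "name_a", "name_c"))

# Named vector: names are matched to factor levels, so entry order is irrelevant
my_lines <- c(name_a = "dashed", name_b = "solid", name_c = "dotted")

p <- ggplot(long_df, aes(x = x, y = val, linetype = var)) +
  geom_line() +
  scale_linetype_manual(values = my_lines)
p
```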

Regards,

Tom



> On 2018-05-21, at 23:16, John wrote:
> 
> Hi,
> 
>   I'd like to graph three lines on ggplot2 and I intend the lines to be
> "solid", "dashed", and "dotted". The legend names are "name_b", "name_a",
> "name_c". I'd like the legend to be presented in this order: "name_b" at the
> top, and "name_c" at the bottom.
> As a consequence, the legend is indeed in the order: name_b at the top and
> name_c at the bottom. However, I'd like name_b to correspond to "solid",
> while it corresponds to "dashed", etc, which I don't want. How could I make
> "solid" correspond to "name_b"? Thanks,
> 
> #
> 
> library(ggplot2)
> df<-data.frame(x1=c(1,2), y1=c(3,4),z1=c(5,6),w1=c(7,8))
> p1<-ggplot(df, aes(x=1:2, y=x1))+
>  geom_line(aes(linetype="name_b"))+
>  geom_line(aes(x=1:2, y=y1, linetype="name_a"), df)+
>  geom_line(aes(x=1:2, y=z1, linetype="name_c"), df)+
>  scale_linetype_manual(name="", values=c("solid","dashed", "dotted"),
> breaks=c("name_b","name_a","name_c"))
> ###
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.





Re: [R] Warning when running R - can't install packages either

2016-05-12 Thread Tom Hopper
setInternet2() first thing after launching R might fix that.


> On May 12, 2016, at 07:45, Alba Pompeo  wrote:
> 
> Hello.
> 
> I've tried to run R, but I receive many warnings and can't do simple
> stuff such as installing packages.
> 
> Here's the full log when I run it.
> 
> http://pastebin.com/raw/2BkNpTte
> 
> Does anyone know what could be wrong here?
> 
> Thanks a lot.
> 


Re: [R] plotting moving range control chart

2010-01-13 Thread Tom Hopper
I have been having the same problem as poster Hodgess, below. It appears
that her question was never answered, so I would like to share a solution
with the community.

The problem is the (apparent?) inability to produce moving range process
behavior (a.k.a. "control") charts with individuals data in the package
"qcc" (v. 2.0). I have also struggled with the same limitation in package
"IQCC" (v. 1.0).

The package "qAnalyst" (v. 0.6.0) provides an option to produce a moving
range chart with individuals data. The example given in the qAnalyst manual
for function spc yields an individuals chart:

> # i-chart, moving range to estimate st. dev. is equal to 2 points with testType=1
> data(rawWeight)
> ichart = spc(x = rawWeight$rawWeight, sg = 2, type = "i", name = "weight", testType = 1)
> plot(ichart)
> summary(ichart)

Changing "type = 'i'" to "type = 'mr'" yields the moving range chart:

> mrchart = spc(x = rawWeight$rawWeight, sg = 2, type = "mr", name = "weight", testType = 1)
> plot(mrchart)
> summary(mrchart)

In separate tests, I have confirmed that qAnalyst correctly computes natural
process limits (a.k.a. "control limits") for X-bar and R charts, using the
average of the subgroup means. I have not yet checked the calculations for
the ImR or other charts.

An additional difference between these packages is that qAnalyst uses the
lattice library to generate output, while the other two packages appear to
use the (traditional) graphics library.
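
For anyone who wants to check the limits by hand, the moving range calculation can be sketched in base R. This assumes the usual constants for a moving range of two points (D4 = 3.267, d2 = 1.128) and uses made-up data, not rawWeight; it is not how any of the three packages implement it:

```r
# Illustrative individuals data (made-up values, not from rawWeight)
x <- c(10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7, 10.3, 10.1)

mr     <- abs(diff(x))     # moving ranges between consecutive points
mr_bar <- mean(mr)         # centre line of the mR chart
ucl_mr <- 3.267 * mr_bar   # upper limit: D4 for subgroups of size 2
lcl_mr <- 0                # lower limit: D3 is 0 for subgroups of size 2

# Natural process limits for the individuals chart: 2.66 = 3 / d2, d2 = 1.128
x_bar    <- mean(x)
x_limits <- x_bar + c(-1, 1) * 2.66 * mr_bar

# Moving range chart with centre line (solid) and upper limit (dashed)
plot(mr, type = "b", ylim = c(0, max(mr, ucl_mr)),
     xlab = "Observation", ylab = "Moving range")
abline(h = c(mr_bar, ucl_mr), lty = c(1, 2))
```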

Regards,

Tom


On Tue, 10 Nov 2009 23:39:23 -0600, Erin Hodgess wrote:

> Dear R People:
>
> I am using qcc for a quality control class.
>
> I have used qcc with type "xbar.one" for individuals but cannot determine
> how to plot a moving range control chart.
>
> Has anyone done that, please?
>
> Thanks,
> Erin
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodgess_at_gmail.com
>
>



[R] modeest with non-numeric data?

2013-07-26 Thread Tom Hopper
Hello,

I have recently discovered the modeest library, and am trying to understand
how to use it with non-numeric data (e.g. determining the most common last
name, or analysing customer demographics by zip code).

I have the mlv() function working for numeric (double and integer) data,
but it throws either an error or a warning and produces unexpected output
with character data. Any help is appreciated.

A simple example:


> my.rand.letters <- sample(letters, size=100, replace=TRUE)

> mlv(my.rand.letters, mode=C("discrete"))
Error in match.arg(x, .distribList) : 'arg' must be of length 1
In addition: There were 21 warnings (use warnings() to see them)

> mlv(as.factor(my.rand.letters))
Mode (most frequent value): NA NA
Bickel's modal skewness: -2
Call: mlv.factor(x = as.factor(my.rand.letters))
Warning message:
In discrete(x, ...) : NAs introduced by coercion



TIA,

Tom



[R] Fwd: modeest with non-numeric data?

2013-07-27 Thread Tom Hopper
Hello,

(Apologies for the repost, but it appears that the original text was
garbled.)

I have recently discovered the modeest library, and am trying to understand
how to use it with non-numeric data (e.g. determining the most common last
name, or analysing customer demographics by zip code).

I have the mlv() function working for numeric (double and integer) data,
but it throws either an error or a warning and produces unexpected output
with character data. The modeest help is not clear to me on character data,
and I have been unable to find examples online. Any help is appreciated.

A simple example:

> my.rand.letters <- sample(letters, size=100, replace=TRUE)

> mlv(my.rand.letters, mode=C("discrete"))
Error in match.arg(x, .distribList) : 'arg' must be of length 1
In addition: There were 21 warnings (use warnings() to see them)

> mlv(as.factor(my.rand.letters))
Mode (most frequent value): NA NA
Bickel's modal skewness: -2
Call: mlv.factor(x = as.factor(my.rand.letters))
Warning message:
In discrete(x, ...) : NAs introduced by coercion
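
For comparison, the kind of result I am after can be had from a plain tabulation in base R. This is a minimal sketch that does not use modeest at all; the seed is fixed only to make it repeatable:

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is repeatable
my.rand.letters <- sample(letters, size = 100, replace = TRUE)

# Count how often each letter occurs
counts <- table(my.rand.letters)

# Keep every value tied for the highest count (there may be more than one mode)
modes <- names(counts)[counts == max(counts)]
modes
```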

TIA,

Tom



[R] Replacing a plotting function in a package with a ggplot2 version

2014-03-03 Thread Tom Hopper
I’ve been working on a home-brewed version of the plot function in the package 
qcc that would be more compatible with grid graphics. I’d like to share this 
with the R community, in the hopes that this reference helps some of my fellow 
learners. The issues that I overcame were all documented elsewhere, but not 
always easy to find; hopefully this helps with the search engines.

I have written a replacement for the plot.qcc() function in the qcc package. 
Features that may be of interest:

* The new plot function generates graphs almost identical to the original base 
graphic versions, but using ggplot2, gtable and grid. Some of the differences 
in programming technique are documented (e.g. the need to pass data frames 
rather than individual vectors to ggplot). Several issues were overcome, 
including the search scope of aes() and the use of aes_string() versus aes(), 
especially when using ggplot within a function.

* The new function replaces plot.qcc() in the qcc namespace, so that after 
calling source() with my code, all calls to plot() with a qcc object draw the 
new ggplot2-based version. The code demonstrates how to unbind an existing 
function and bind a new function in its place within the package namespace 
using the following code:

unlockBinding(sym="plot.qcc", env=getNamespace("qcc"));
assignInNamespace(x="plot.qcc", value=plot.qcc, ns=asNamespace("qcc"),
                  envir=getNamespace("qcc"));
assign("plot.qcc", plot.qcc, envir=getNamespace("qcc"));
lockBinding(sym="plot.qcc", env=getNamespace("qcc"));

* Finally, the code shows how to annotate a plot outside the data panel, when 
the labels need to be aligned with the data (e.g. plot a reference line with 
geom_hline() and then annotate a label for the line to the right of the plot).
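
The aes() scoping point in the first bullet can be sketched as follows. This is not the code from my plot.qcc() replacement; plot_column() is a hypothetical helper, and it assumes a ggplot2 version (3.0.0 or later) that provides the .data pronoun as an alternative to aes_string():

```r
library(ggplot2)

# Hypothetical helper: plot one column of a data frame against its row index.
# The column name arrives as a string, so a bare name inside aes() would not work.
plot_column <- function(df, col) {
  df$index <- seq_len(nrow(df))   # pass a data frame, not loose vectors
  ggplot(df, aes(x = index, y = .data[[col]])) +
    geom_line() +
    # Reference line at the mean of the chosen column
    geom_hline(yintercept = mean(df[[col]]), linetype = "dashed") +
    ylab(col)
}

p <- plot_column(data.frame(value = c(3, 5, 4, 6, 5)), "value")
p
```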

This implementation might not be the most elegant or bug-free, but it has 
passed my testing so far.

The code is available on Github at https://github.com/tomhopper/gcc_ggplot/

Best Regards,

Tom Hopper




Re: [R] When is *interactive* data visualization useful to use?

2011-02-18 Thread Tom Hopper
Tal,

One interactive capability that I have repeatedly wished for (but
never taken the time to develop with the existing R tools) is the
ability to interactively zoom in on and out of a data set, and to
interactively create "call-outs" of sections of the data. Much of the
data that I deal with takes the form of time series where both the
full data and small sections carry meaningful information.

Some of the capabilities of Deducer approach interactive graphing,
such as adjusting alpha values or smoothers, though the updates don't
happen in quite real-time.

- Tom

On Friday, February 11, 2011, Tal Galili  wrote:
> Hello all,
>
> Before getting to my question, I would like to apologize for asking this
> question here.  My question is not directly an R question, however, I still
> find the topic relevant to R community of users  - especially due to only *
> partial* (current) support for interactive data visualization (see here:
> http://cran.r-project.org/web/views/Graphics.html  were with iplots we are
> waiting for iplots extreme, and with rggobi, it currently can not run with R
> 2.12 and windows 7 OS).
>
> And now for my question:
>
> While preparing for a talk I will give soon, I recently started digging into
> two major (Free) tools for interactive data visualization:
> GGobi and Mondrian - both offer a great range of
> capabilities (even if they're a bit buggy).
>
> I wish to ask for your help in articulating (both to myself, and for my
> future audience) *When is it helpful to use interactive plots? Either for
> data exploration (for ourselves) and data presentation (for a "client")?*
>
> For when explaining the data to a client, I can see the value of animation
> for:
>
>    - Using "identify/linking/brushing" for seeing which data point in the
>    graph is what.
>    - Presenting a sensitivity analysis of the data (e.g: "if we remove this
>    point, here is what we will get)
>    - Showing the effect of different groups in the data (e.g: "let's look at
>    our graphs for males and now for the females")
>    - Showing the effect of time (or age, or in general, offering another
>    dimension to the presentation)
>
> For when exploring the data ourselves, I can see the value of
> identify/linking/brushing when exploring an outlier in a dataset we are
> working on.
>
> But other then these two examples, I am not sure what other practical use
> these techniques offer. Especially for our own data exploration!
>
> It could be argued that the interactive part is good for exploring (For
> example) a different behavior of different groups/clusters in the data. But
> when (in practice) I approached such situation, what I tended to do was to
> run the relevant statistical procedures (and post-hoc tests) - and what I
> found to be significant I would then plot with colors clearly dividing the
> data to the relevant groups. From what I've seen, this is a safer approach
> than "wandering around" the data (which could easily lead to data dredging,
> where the scope of the multiple comparison needed for correction is not even
> clear).
>
> I'd be very happy to read your experience/thoughts on this matter.
>
>
> Thanks in advance,
> Tal
>
>
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> --
>


[R] Cloning R Installation Across Multiple Computers

2012-03-02 Thread Tom Hopper
I would like to set up identical R installations, with the same packages,
on multiple computers and with minimal interaction by users. Ideally, I
would like to have an installation script that the user can just run that
will set up everything, including R itself and base packages.

Standard packages would need to include ggplot2 and its dependencies, Rcmdr
and its dependencies and some Rcmdr plug-ins. Best of all would be to have
the script include JGR and Deducer in the installation. This would be set
up on Windows XP 32-bit, and all packages have to install from local .zip
files; R cannot get to the package servers through our firewall.
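
For the local .zip part, a sketch of what I have in mind, assuming all the .zip files have been copied into one folder (install_local_zips() and the path in the usage comment are hypothetical names):

```r
# Hypothetical helper: install every Windows binary package found in a folder
install_local_zips <- function(zip_dir) {
  zips <- list.files(zip_dir, pattern = "\\.zip$", full.names = TRUE)
  if (length(zips) > 0) {
    # repos = NULL tells install.packages() to treat pkgs as local file paths
    install.packages(zips, repos = NULL, type = "win.binary")
  }
  invisible(zips)  # return the files that were found
}

# Usage (the path is a placeholder):
# install_local_zips("C:/R-packages")
```

With repos = NULL, install.packages() does not fetch dependencies, so every dependency would need its own .zip in the folder.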

I could do the setup once, manually, on one machine, but I'm not sure if I
can simply copy the R installation directory to other computers and have it
still work. I don't think that the Windows registry would be configured
if I did it this way, and I'm not sure that JGR would be correctly
installed. Another approach would be to have the user install R, then copy the
R directory tree from another computer that has been set up, and finally
JGR would have to be installed by the user. I'd like to roll this all into
one, simple step that the users don't have to worry about screwing up.

I recall that there has been some discussion of this on r-help in the past,
but I seem to be using the wrong search terms as I cannot find anything.
Any suggestions for, or pointers to, solutions will be much appreciated.

Thank you,

Tom Hopper



Re: [R] Cloning R Installation Across Multiple Computers

2012-03-03 Thread Tom Hopper
Thank you, Duncan. Building an installer package from source might work.

On Friday, March 2, 2012, Duncan Murdoch  wrote:
> On 12-03-02 5:32 AM, Tom Hopper wrote:

>
> See the Installation and Administration manual, in particular the section
> (3.1.8, I think) on building the Inno Setup installer.
>
> You may even find the MSI installer more convenient, but that code is
> tested less, and is unsupported.
>
> Duncan Murdoch
>
