Hi:

One way to combine Jorge's and David's solutions is to visualize the data in
ggplot2 and/or lattice:

library(ggplot2)
x <- c(0.349763, 3.39489, 1.52249, 0.269066, 0.107872, 0.0451689,
0.590268, 0.275755, 0.751845, 1.00599, 0.652409, 2.80664, 0.0269933,
0.137307, 0.282939, 1.23008, 0.436429, 0.0555626, 1.10624, 53,
1.30411, 1.29749, 53, 3.2552, 1.189, 2.23616, 1.13259, 0.505039,
1.05812, 1.18238, 0.500926, 1.0314, 0.733468, 3.13292, 1.26685,
3.10882, 1.01719, 0.13096, 0.0529692, 0.418408, 0.213299, 0.536631,
1.82336, 1.15287, 0.192519, 0.961295, 51, 0.470511, 4.05688,
1.78098, 0.364686, 1.24533)
y <- c(0.423279, 0.473681, 0.629478, 1.09712, 0.396239, 0.273577,
0.303214, 0.628386, 0.465841, 0.687251, 0.544569, 0.635805, 0.358983,
0.16519, 0.366217, 1.08421, 0.668939, 0.181861, 0.782656, 13.3816,
1.15256, 0.965943, 20, 2.86051, 0.304939, 1.94654, 0.967576,
0.647599, 0.520811, 1.27434, 0.363666, 0.93621, 0.544573, 0.696733,
1.0031, 3.78895, 0.694053, 0.289111, 0.178439, 0.746576, 0.391725,
0.363901, 1.20297, 0.461934, 0.364011, 0.691368, 20, 0.81947,
1.69594, 1.56381, 0.900398, 0.960948)

d <- data.frame(x, y)

g <- ggplot(d, aes(log(x), log(y)))
g + geom_point() + geom_smooth(colour = 'red', size = 1) +
      geom_smooth(method = 'lm', colour = 'blue', size = 1)

The default smooth is a loess curve, which shows the curvature present in
the residual vs. fitted plot from Jorge's solution. The predicted values
from the linear model in the log-log scale lie along the blue line. (To get
rid of the confidence curves, add se = FALSE to both geom_smooth() calls
above.) If you were to fit a model to these data in the log-log scale, the
plot indicates that a quadratic polynomial would be a reasonable next step.
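
If you want to see that quadratic fit on the same plot, one way (a rough
sketch, not tested on your setup - note that the formula in geom_smooth()
refers to the mapped aesthetics, so y ~ poly(x, 2) below means log(y)
modeled as a quadratic in log(x)) is:

g + geom_point() +
      geom_smooth(se = FALSE, colour = 'red', size = 1) +
      geom_smooth(method = 'lm', se = FALSE, colour = 'blue', size = 1) +
      geom_smooth(method = 'lm', formula = y ~ poly(x, 2),
                  se = FALSE, colour = 'darkgreen', size = 1)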

This is pretty easy to do in lattice as well (sans the confidence curves):

library(lattice)
xyplot(log(y) ~ log(x), data = d, type = c('p', 'r'),
        pch = 16, col = 'black',
        panel = function(x, y, ...) {
               panel.xyplot(x, y, ..., col.line = 'blue')  # points + least squares line
               panel.loess(x, y, col.line = 'red')         # loess smooth
               }
        )

I needed to write a small panel function to get separate colors for the
least squares line and the loess curve, but maybe there's an easier way
(col.line = c('blue', 'red') by itself doesn't work - I tried that - and it
makes sense to me why it doesn't).
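
For what it's worth, the panel function approach also extends easily if you
want the quadratic fit on the lattice plot as well; a rough sketch (colours
arbitrary, and note that the panel receives the already-logged values, so
the lm() call below fits in the log-log scale):

xyplot(log(y) ~ log(x), data = d, pch = 16, col = 'black',
        panel = function(x, y, ...) {
               panel.xyplot(x, y, ...)
               panel.lmline(x, y, col.line = 'blue')   # least squares line
               panel.loess(x, y, col.line = 'red')     # loess smooth
               # quadratic fit on the logged values, drawn as a third curve
               qfit <- lm(y ~ poly(x, 2))
               ord <- order(x)
               panel.lines(x[ord], fitted(qfit)[ord], col = 'darkgreen')
               }
        )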

Dennis


On Thu, Dec 23, 2010 at 3:50 PM, David Winsemius <dwinsem...@comcast.net> wrote:

>
> On Dec 23, 2010, at 6:41 PM, David Winsemius wrote:
>
>
>> On Dec 23, 2010, at 5:55 PM, Eric Hu wrote:
>>
>>  Thanks David. I am reposting the data here.
>>>
>>
>> Jorge has already responded masterfully. He's apparently less lazy than I
>> and did all the editing. A log transformation as he illustrated can be very
>> useful with bivariate skewed distributions. The only variation I would have
>> suggested would be to record the default par settings and restore them at
>> the end.
>>
>
> You could also repeat the plot and use lines() to overlay the predicted
> values
>
> plot(x,y, log="xy")
> lines(sort(x), exp(fitted(fit))[order(x)])
>
> It complements the residual plot and the QQ plot in the plot.lm display
> when considering the possibility that this may not be a truly
> log-log-linear relationship.
>
>
>
>
>> --
>> David
>>
>>>
>>> Eric
>>>
>>>
>>>  Hi,
>>>>
>>>> I would like to plot a linear relationship between variable x and y.
>>>> Can anyone help me with scaling the plot and axes so that all data
>>>> points can be visualized more evenly? A plain plot(x,y) will
>>>> generate condensed points near (0,0) due to several large data
>>>> points. Thank you.
>>>>
>>>> Eric
>>>>
>>>>
>>>> dput(x)
>>>>
>>> c(0.349763, 3.39489, 1.52249, 0.269066, 0.107872, 0.0451689,
>>> 0.590268, 0.275755, 0.751845, 1.00599, 0.652409, 2.80664, 0.0269933,
>>> 0.137307, 0.282939, 1.23008, 0.436429, 0.0555626, 1.10624, 53,
>>> 1.30411, 1.29749, 53, 3.2552, 1.189, 2.23616, 1.13259, 0.505039,
>>> 1.05812, 1.18238, 0.500926, 1.0314, 0.733468, 3.13292, 1.26685,
>>> 3.10882, 1.01719, 0.13096, 0.0529692, 0.418408, 0.213299, 0.536631,
>>> 1.82336, 1.15287, 0.192519, 0.961295, 51, 0.470511, 4.05688,
>>> 1.78098, 0.364686, 1.24533)
>>>
>>>> dput(y)
>>>>
>>> c(0.423279, 0.473681, 0.629478, 1.09712, 0.396239, 0.273577,
>>> 0.303214, 0.628386, 0.465841, 0.687251, 0.544569, 0.635805, 0.358983,
>>> 0.16519, 0.366217, 1.08421, 0.668939, 0.181861, 0.782656, 13.3816,
>>> 1.15256, 0.965943, 20, 2.86051, 0.304939, 1.94654, 0.967576,
>>> 0.647599, 0.520811, 1.27434, 0.363666, 0.93621, 0.544573, 0.696733,
>>> 1.0031, 3.78895, 0.694053, 0.289111, 0.178439, 0.746576, 0.391725,
>>> 0.363901, 1.20297, 0.461934, 0.364011, 0.691368, 20, 0.81947,
>>> 1.69594, 1.56381, 0.900398, 0.960948)
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>
> David Winsemius, MD
> West Hartford, CT
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
