Hello everyone,

I have a data frame D with 4 columns: id, X, Y, C. I want to draw a simple scatter plot of D$X vs. D$Y, using the D$C values as the color. (id is just a text string and is not used for the plot.)
But actually, I don't want to use the raw values of D$C. I would prefer to calculate the average value of D$C over the points in a fixed neighborhood. In other words, I would like to smooth the colors according to the local density of points.

I am looking for any function or package that could do this. So far I have looked at the MASS package and its function kde2d, which computes a two-dimensional kernel density estimate of the points, but I don't see how I could then use that information to recalculate my D$C values.

Here is a piece of the data frame:

> head(D)
      id         X         Y            C
1 O13297 44.444444  21.61220 -0.136651639
2 O13329 31.272085   4.01590 -0.117016949
3 O13525  6.865672   2.43884 -0.161173913
4 O13539 14.176245   7.81217 -0.075756757
5 O13541 73.275862   3.59012 -0.006988235
6 O13547 28.991597 258.99900 -0.013985507
> dim(D)
[1] 3616    4
> apply(D[,-1],2,range)
               X          Y          C
[1,]   0.3378378     0.0003 -0.7382222
[2,] 100.0000000 24556.4000  0.5582500

(Y is far from evenly spread, so I use log='y' in the plot function.)

I used a palette of 100 colors ranging from blue to yellow to red:
> pal = colorRampPalette(c("blue","yellow","red"))(100)

To make the D$C values correspond to colors, I used cut() with the following breaks (101 breaks from -1.2 to 1.2, which give 100 intervals, one per color):

> BREAKS
  [1] -1.2000 -0.8000 -0.4000 -0.3600 -0.3200 -0.2800 -0.2400 -0.2000 -0.1925
 [10] -0.1850 -0.1775 -0.1700 -0.1625 -0.1550 -0.1475 -0.1400 -0.1368 -0.1336
 [19] -0.1304 -0.1272 -0.1240 -0.1208 -0.1176 -0.1144 -0.1112 -0.1080 -0.1048
 [28] -0.1016 -0.0984 -0.0952 -0.0920 -0.0888 -0.0856 -0.0824 -0.0792 -0.0760
 [37] -0.0728 -0.0696 -0.0664 -0.0632 -0.0600 -0.0568 -0.0536 -0.0504 -0.0472
 [46] -0.0440 -0.0408 -0.0376 -0.0344 -0.0312 -0.0280 -0.0248 -0.0216 -0.0184
 [55] -0.0152 -0.0120 -0.0088 -0.0056 -0.0024  0.0008  0.0040  0.0072  0.0104
 [64]  0.0136  0.0168  0.0200  0.0232  0.0264  0.0296  0.0328  0.0360  0.0392
 [73]  0.0424  0.0456  0.0488  0.0520  0.0552  0.0584  0.0616  0.0648  0.0680
 [82]  0.0712  0.0744  0.0776  0.0808  0.0840  0.0872  0.0904  0.0936  0.0968
 [91]  0.1000  0.1250  0.1500  0.1750  0.2000  0.2250  0.2500  0.4875  0.7250
[100]  0.9625  1.2000
> C.levels = as.numeric(cut(D$C,breaks=BREAKS))
> length(C.levels)
[1] 3616

C.levels ranges from 2 to 98, and to plot the colors I index the palette with it, i.e. pal[C.levels]:

> plot(x=D$X, y=D$Y, col=pal[C.levels], log='y')
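
To make the idea concrete, here is a naive sketch of the kind of neighborhood averaging I have in mind. It is only an illustration under my own assumptions, not an existing function: the name smooth_by_neighborhood, the radius r, and the choice to measure distances after rescaling X and log10(Y) to [0, 1] are all made up for this example, and the toy data is random, not my real D.

```r
# Hypothetical helper (not from any package): for each point, average the
# values of all points within radius r, the point itself included.
# Assumption: distances are measured after rescaling x and log10(y) to [0, 1],
# so r is a fraction of the plotted range. y must be strictly positive, and
# x and y must each take more than one distinct value.
smooth_by_neighborhood <- function(x, y, value, r) {
  rescale <- function(v) (v - min(v)) / diff(range(v))
  sx <- rescale(x)
  sy <- rescale(log10(y))
  vapply(seq_along(value), function(i) {
    # Logical mask of points inside the circle of radius r around point i
    near <- (sx - sx[i])^2 + (sy - sy[i])^2 <= r^2
    mean(value[near])
  }, numeric(1))
}

# Made-up toy data standing in for D (not the real values)
set.seed(1)
toy <- data.frame(X = runif(200, 0, 100),
                  Y = 10^runif(200, -3, 4),
                  C = rnorm(200, 0, 0.1))
toy$C.smooth <- smooth_by_neighborhood(toy$X, toy$Y, toy$C, r = 0.1)

# With the real data, the smoothed values would go through the same cut() step:
# C.levels <- as.numeric(cut(smooth_by_neighborhood(D$X, D$Y, D$C, r = 0.1),
#                            breaks = BREAKS))
```

This compares every point with every other point, which should still be fine for 3616 rows, but I suspect an existing function or package does this more properly (for example with a kernel weighting instead of a hard cutoff), which is what I am hoping to be pointed to.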