I've been trying to implement bivariate kernel density estimation. For data like mine, function "kde" from package "ks", with a bandwidth matrix selected by function "Hscv", seems like a very good choice. Unfortunately, Hscv is unmanageably slow except for very small samples (up to a few hundred observations), and my samples are quite large (up to a few thousand). I've reviewed the help files, vignettes, previous postings on this list, and the JSS paper describing ks, and have found little mention of constraints on sample size other than using k-fold cross-validation to speed calculation. Unfortunately, that option is listed but not enabled for Hscv.
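For concreteness, here is a minimal sketch of the workflow I describe. The simulated coordinates are an assumption made only so the code is self-contained; my real data are not bivariate normal. The object names xy.100 and xy.200 match the ones used in the timings.

```r
# Simulated stand-ins for my data (an assumption, for reproducibility only).
set.seed(1)
xy.100 <- cbind(x = rnorm(100), y = rnorm(100))
xy.200 <- cbind(x = rnorm(200), y = rnorm(200))

# Run the estimation only if package ks is installed.
if (requireNamespace("ks", quietly = TRUE)) {
  # Bandwidth matrix selected by smoothed cross-validation.
  H <- ks::Hscv(x = xy.100)
  # Bivariate kernel density estimate using that bandwidth matrix.
  fhat <- ks::kde(x = xy.100, H = H)
  print(H)
}
```

Wrapping the Hscv call in system.time() gives timings of the kind I report; the exact numbers will depend on the machine and the version of ks.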
An example illustrates my problem. Each of the following expressions returns the time elapsed to estimate a bandwidth matrix. The first is for a sample of 100 x and y coordinates; the second is for a sample of 200 x and y coordinates.

> system.time(Hscv(x=xy.100))
   user  system elapsed
   1.97    0.03    2.00

> system.time(Hscv(x=xy.200))
   user  system elapsed
   6.03    0.17    6.22

I have to do this many, many times, and each run will involve up to several thousand records, so you can see my problem. I should think that others must surely have encountered and overcome this challenge. If anyone can kindly point me in a productive direction, I will be most grateful.

-----
Glen Sargeant
Research Wildlife Biologist