On Wed, 2010-11-24 at 13:00 -0500, Jonathan P Daily wrote:
> I understand that smoothing splines produce continuous models; however,
> the end product of SiZer is not a single model that is then used in any
> predictive manner. Rather, the end product is a map of potential
> changepoint locations along a gradient. Are you suggesting that SiZer
> would not find a changepoint at the Aswan Dam?

No, not at all. What I am arguing is that SiZer fits a series of models
of diminishing bandwidth, and the change points those models detect vary
smoothly between the states either side of the change point. In fact,
SiZer is fitting a series of smoothly varying models and showing you
where the derivatives of those smoothly varying functions are
significantly different from 0.

    require(SiZer)
    mod <- SiZer(time(Nile), Nile, x.grid = 40) ## slowish

All those models show us that something changed around 1900, but the
nature of that change is one that is smoothly varying.

My point is that SiZer posits a particular form for the nature of the
change: smoothly varying. If all you use is SiZer then these are the
sorts of changes you will find.

Did the flow at Aswan change smoothly over a period of many years? If
all you used was SiZer, then you'd conclude it did. It might look
something like:

    mod2 <- smooth.spline(time(Nile), Nile, df = 10)
    pred <- predict(mod2, x = seq(min(time(Nile)), max(time(Nile)),
                                  length.out = 200))
    plot(Nile)
    lines(pred, col = "red", lwd = 2)

But if we used a different change point detection routine that focussed
on a discontinuity, then that is the sort of change point we would
detect. Here is a tree version, for example:

    require(rpart)
    mod3 <- rpart(Nile ~ time(Nile)) ## 1898 change point
    plot(Nile)
    lines(pred, col = "red", lwd = 2)   ## smooth fit from above
    before <- mean(Nile[time(Nile) < 1898])
    after <- mean(Nile[time(Nile) >= 1898])
    lines(1871:1897, y = rep(before, 27), col = "blue", lwd = 2)
    lines(1898:1970, y = rep(after, 73), col = "blue", lwd = 2)

So, in short, my point was that you need to understand what form of
change point a given technique is looking to identify; otherwise you
could be misled as to what is going on.

G

> If so, I would argue against that conclusion. If your point is that
> pulling out one underlying spline model used to create a SiZer map and
> trying to draw conclusions from that model is ineffective, you are
> absolutely right.
> I tend to think of SiZer outputs as more akin to tree outputs than
> piecewise regression outputs.
> --------------------------------------
> Jonathan P. Daily
> Technician - USGS Leetown Science Center
> 11649 Leetown Road
> Kearneysville WV, 25430
> (304) 724-4480
> "Is the room still a room when its empty? Does the room,
> the thing itself have purpose? Or do we, what's the word... imbue it."
> - Jubal Early, Firefly
>
> Gavin Simpson <gavin.simp...@ucl.ac.uk> wrote on 11/24/2010 12:09:15 PM:
>
> > Re: [R] Population abundance, change point
> > To: Jonathan P Daily
> > Cc: r-help, carusonm
> >
> > On Wed, 2010-11-24 at 10:59 -0500, Jonathan P Daily wrote:
> > > I agree that SiZer is not the ultimate answer to all changepoint
> > > analysis, but that is why there are so many changepoint detection
> > > methods used. I will clarify, though, that my understanding of
> > > SiZer (which may be wrong) was that the smoothing splines are just
> > > a vessel for finding the changepoints, and made no assumptions
> > > about the continuity of the changepoint itself.
> >
> > But by the nature of using splines, they (you) are positing a model
> > for what the changepoint looks like. If you model the classic Nile
> > data with a spline model (say using gam()) then it will suggest that
> > there was a smooth transition from high flow before the dam at Aswan
> > was built to lower flow afterwards. Other techniques that posit a
> > discontinuity as a changepoint would fit two, effectively flat
> > [slope 0], straight lines either side of the point when the dam was
> > built.
> >
> > If you didn't know a dam was built and all you fitted was a spline to
> > the data, that would likely influence how you interpreted the change.
> >
> > > One thing that would certainly help, especially with the confidence
> > > intervals about 0, is some bandwidth selection standard, though
> > > choosing that standard would be a difficult process to say the
> > > least.
> >
> > That won't help with the lack of independence in the residuals, which
> > will result in deflated CIs on the derivatives, hence more wiggles
> > seem significant... Oh the joy.
> >
> > Also, there has to be a point where with SiZer you are just looking
> > at pattern in noise at the small bandwidth end of things...
> >
> > I'll get off my soap box about now! ;-)
> >
> > Cheers,
> >
> > G
> >
> > > Gavin Simpson <gavin.simp...@ucl.ac.uk> wrote on 11/24/2010
> > > 09:15:55 AM:
> > >
> > > > Re: [R] Population abundance, change point
> > > > To: Jonathan P Daily
> > > > Cc: Mike Marchywka, r-help, r-help-bounces, carusonm
> > > >
> > > > On Wed, 2010-11-17 at 09:17 -0500, Jonathan P Daily wrote:
> > > > > Indeed I have looked into various non-standard changepoint
> > > > > analysis methods. I figured the OP was more interested in
> > > > > traditional methods since you have to spend less time
> > > > > justifying your methodology. Wavelets are one potential
> > > > > nontraditional method, as is Significant Zero Crossings (R
> > > > > package SiZer), which fits arbitrary-degree smoothing splines
> > > > > over a range of bandwidth parameters and looks for changes.
> > > >
> > > > ...By looking to see if the derivative of the fitted curve is
> > > > different from 0 (given a suitable confidence interval on the
> > > > derivative). My problem with all of this is that these data are
> > > > time series and SiZer doesn't take this into account (AFAICS)
> > > > when computing the confidence intervals - they are certainly too
> > > > narrow for examples I have run.
> > > >
> > > > Also, if these things are using splines, aren't we already
> > > > assuming that the underlying function is smooth and not a
> > > > discontinuity? So which technique the OP chooses will depend on
> > > > how they think about the type of change taking place at the
> > > > "changepoint" - a point I think you made earlier, Jonathan.
> > > >
> > > > Don't mean to be too negative; this has been a very useful
> > > > discussion that I am coming to late after a spot of time in the
> > > > field.
> > > >
> > > > All the best,
> > > >
> > > > G
> > > >
> > > > > With large communities of abundance counts, another approach
> > > > > that is gaining popularity is the community-level indicator
> > > > > taxa analysis (TITAN), though that is not useful to the OP.
> > > > >
> > > > > Mike Marchywka <marchy...@hotmail.com> wrote on 11/17/2010
> > > > > 09:11:11 AM:
> > > > >
> > > > > > RE: [R] Population abundance, change point
> > > > > > To: jdaily, carusonm
> > > > > > Cc: r-help, r-help-bounces
> > > > > >
> > > > > > > To: carus...@gmail.com
> > > > > > > From: jda...@usgs.gov
> > > > > > > Date: Wed, 17 Nov 2010 08:45:01 -0500
> > > > > > > CC: r-help@r-project.org; r-help-boun...@r-project.org
> > > > > > > Subject: Re: [R] Population abundance, change point
> > > > > > >
> > > > > > > There are really no set ways to determine a changepoint,
> > > > > > > since a changepoint depends completely on what you decide.
> > > > > > > Recursive partitioning will fit a best changepoint, but it
> > > > > > > will pretty much always fit one. This
> > > > > >
> > > > > > If you are open to newer ideas, have you looked at wavelets
> > > > > > at all? These come up on Google along with R. Also, with
> > > > > > only a few points, even 20-30, you could consider
> > > > > > exhaustively fitting slopes to all 2^n subsets and plowing
> > > > > > through the histograms looking for anything that may be
> > > > > > publishable or illuminating about your data. Fitting to your
> > > > > > own model or null hypotheses would make interesting
> > > > > > contrasts of course, "populations remained the same after
> > > > > > atrazine spill or asteroid hit" etc.
> > > > > >
> > > > > > > function can be found in the package rpart:
> > > > > > >
> > > > > > > > library(rpart)
> > > > > > > > fit <- rpart(count ~ year, control = list(maxdepth = 1))
> > > > > > > > summary(fit)
> > > > > > >
> > > > > > > However, this measure offers no level of confidence. This
> > > > > > > is where packages like strucchange and party come into
> > > > > > > use, as they provide measures of confidence.
> > > > > > > Alternatively, you could look into regression-based
> > > > > > > methods where the changepoint is some parameter. Piecewise
> > > > > > > regression, for instance, is as simple as fitting a spline
> > > > > > > of degree 1 with a knot at the changepoint X:
> > > > > > >
> > > > > > > > library(splines)
> > > > > > > > fit <- lm(count ~ bs(year, knots = X, degree = 1))
> > > > > > > > plot(year, count)
> > > > > > > > lines(year, fitted(fit))
> > > > > > >
> > > > > > > Then you can fit a regression with the knot at each year
> > > > > > > and compare. Alternatively, since count data is often
> > > > > > > noisy, you could easily substitute quantile regression for
> > > > > > > linear regression to much the same effect (for whatever
> > > > > > > tau you decide; I used 0.8 but this is arbitrary):
> > > > > > >
> > > > > > > > library(splines)
> > > > > > > > library(quantreg)
> > > > > > > > fit <- rq(count ~ bs(year, knots = X, degree = 1),
> > > > > > > >           tau = 0.8)
> > > > > > > > plot(year, count)
> > > > > > > > lines(year, fitted(fit))
> > > > > > >
> > > > > > > r-help-boun...@r-project.org wrote on 11/16/2010 05:30:49
> > > > > > > PM:
> > > > > > >
> > > > > > > > [R] Population abundance, change point
> > > > > > > > From: Nicholas M. Caruso
> > > > > > > > To: r-help
> > > > > > > > Sent by: r-help-boun...@r-project.org
> > > > > > > >
> > > > > > > > I am trying to understand my population abundance data
> > > > > > > > and am looking into change point analyses to try to
> > > > > > > > determine at approximately what point populations begin
> > > > > > > > to change (either declining or increasing).
> > > > > > > >
> > > > > > > > Can anyone offer suggestions on ways to go about this?
> > > > > > > >
> > > > > > > > I have looked into the bcp and strucchange packages but
> > > > > > > > am not completely convinced that these are appropriate
> > > > > > > > for my data.
> > > > > > > >
> > > > > > > > Here is an example of what type of data I have:
> > > > > > > > - Year of survey (continuous variable), 1960 - 2009.
> > > > > > > >   There are gaps in the surveys (e.g., there were no
> > > > > > > >   surveys from 2002-2004).
> > > > > > > > - Relative abundance of salamanders during the survey
> > > > > > > >   periods.
> > > > > > > >
> > > > > > > > Thanks for your help, Nick
> > > > > > > >
> > > > > > > > --
> > > > > > > > Nicholas M Caruso
> > > > > > > > Graduate Student
> > > > > > > > CLFS-Biology
> > > > > > > > 4219 Biology-Psychology Building
> > > > > > > > University of Maryland, College Park, MD 20742-5815
> > > > > > > >
> > > > > > > > -----------------------------------------------------
> > > > > > > > I learned something of myself in the woods today,
> > > > > > > > and walked out pleased for having made the acquaintance.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
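
For anyone wanting to see the smooth-versus-step contrast from this thread in numbers, here is a rough sketch using only base R and the built-in Nile series. It is not taken from any of the packages discussed: it simply scans every candidate break year for a two-mean (step) model by least squares, then reports the residual sum of squares next to that of a smooth.spline fit with df = 10. The names years, flow, rss_at, cand and best are made up for illustration, and the minimum of 5 observations per segment is an arbitrary assumption.

```r
## Sketch only: least-squares search for a single step change in Nile.
## Base R; all object names here are illustrative, not from any package.
years <- as.numeric(time(Nile))
flow  <- as.numeric(Nile)

## Residual sum of squares for a two-mean model whose first segment
## ends at year `b` (inclusive)
rss_at <- function(b) {
  left  <- flow[years <= b]
  right <- flow[years >  b]
  sum((left - mean(left))^2) + sum((right - mean(right))^2)
}

## Candidate break years, keeping at least 5 observations on each side
## (the value 5 is an arbitrary choice for this sketch)
cand <- years[5:(length(years) - 5)]
rss  <- vapply(cand, rss_at, numeric(1))
best <- cand[which.min(rss)]

## Reference fits: a single overall mean, and a smooth spline (df = 10)
rss_one <- sum((flow - mean(flow))^2)
sm      <- smooth.spline(years, flow, df = 10)
rss_sm  <- sum((flow - predict(sm, years)$y)^2)

cat("last year of first segment:", best, "\n")
cat("RSS constant:", rss_one, " step:", min(rss),
    " spline (df = 10):", rss_sm, "\n")

## Lag-1 autocorrelation of the step-model residuals, as a crude check
## on the independence concern raised earlier in the thread
step_fit <- ifelse(years <= best,
                   mean(flow[years <= best]),
                   mean(flow[years > best]))
res <- flow - step_fit
print(acf(res, plot = FALSE)$acf[2])
```

A lower RSS for the spline would not by itself favour the smooth model, since it spends roughly 10 degrees of freedom against the step model's two means plus a break location. The point of the comparison is only that each method returns the kind of change it was built to look for.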