Yes, Rsleep seems to be just the right thing to use for this:

  Separating model optimization and model validation in statistical
  cross-validation as applied to crystallography.
  G. J. Kleywegt, Acta Cryst. (2007), D63, 939-940.
Practically, it would mean that we split the 10% of test reflections into 5%
used for optimizations like #1-4, and the other 5% (the sleep set) is never
used for anything. The big question here is whether this will make any
important difference. I suspect, as with many similar things, there will be
no clear-cut answer (that is, it may or may not make a difference, case
dependent). A small illustrative sketch of such a split, and of the purely
statistical scatter in Rfree, is appended after the quoted thread below.

Pavel

On Mon, Oct 17, 2011 at 8:57 AM, Thomas C. Terwilliger <terwilli...@lanl.gov> wrote:

> I think that we are using the test set for many things:
>
> 1. Determining and communicating to others whether our overall procedure
>    is overfitting the data.
>
> 2. Identifying the optimal overall procedure in cases where very different
>    options are being considered (e.g., should I use TLS).
>
> 3. Calculating specific parameters (e.g., sigmaA).
>
> 4. Identifying the "best" set of overall parameters.
>
> I would suggest that we should generally restrict our usage of the test
> set to purposes #1-3. Given a particular overall procedure for refinement,
> a very good set of parameters should be obtainable from the working set
> of data.
>
> In particular, approaches in which many parameters (in the limit... all
> parameters) are fit to minimize Rfree do not seem likely to produce the
> best model overall. It might be worth doing some experiments with the
> super-free set approach to determine whether this is true.
>
>> Hi,
>>
>> On Sun, Oct 16, 2011 at 7:48 PM, Ed Pozharski <epozh...@umaryland.edu> wrote:
>>
>>> On Sat, 2011-10-15 at 11:48 +0300, Nicholas M Glykos wrote:
>>>
>>>> For structures with a small number of reflections, the statistical
>>>> noise in the 5% sets can be very significant indeed. We have seen
>>>> differences between Rfree values obtained from different sets
>>>> reaching up to 4%.
>>
>> this is in line with my observations too.
>> Not surprising at all, though (see my previous post on this subject): a
>> small, seemingly insignificant change somewhere may result in refinement
>> taking a different pathway and ending up in a different local minimum.
>> There is even a way of making practical use of this (Rice, Shamoo &
>> Brunger, 1998; Korostelev, Laurberg & Noller, 2009; ...).
>>
>> This "seemingly insignificant change somewhere" may be:
>> - what Ed mentioned (a different noise level in the free reflections, or
>>   simply different strengths of the reflections between free sets);
>> - slightly different starting conditions (starting parameter values);
>> - the random seed used in the X-ray/restraints target weight calculation
>>   (applies to phenix.refine);
>> - I can go on for 10+ possibilities.
>>
>> I do not know whether choosing the result with the lowest Rfree is a good
>> idea or not (after reading Ed's post I am slightly puzzled now), but what
>> is definitely a good idea, in my opinion, is to know the range of possible
>> R-factor values in your specific case, so you know which difference
>> between two R-factors obtained in two refinement runs is significant and
>> which one is not.
>>
>> Pavel
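[Appended sketch 1] A minimal sketch of the work/free/sleep partition Pavel
describes above: ~90% work, ~5% free (used for purposes #1-4 such as Rfree,
weight optimization, sigmaA), and ~5% sleep reflections that are never touched
during refinement. This is not phenix.refine or cctbx code; the function name
and interface are hypothetical, and a real program would assign flags per
unique reflection and keep them fixed for all subsequent refinements.

```python
# Hypothetical sketch only -- not phenix.refine / cctbx API.
import random

def assign_rfree_flags(n_reflections, free_fraction=0.05,
                       sleep_fraction=0.05, seed=0):
    """Assign each reflection to 'work', 'free' or 'sleep'.

    'free'  : used for Rfree, weight optimization, sigmaA, etc. (purposes #1-4)
    'sleep' : never used for anything during refinement; inspected only once,
              at the very end, as an independent check (the Rsleep idea).
    """
    rng = random.Random(seed)
    flags = []
    for _ in range(n_reflections):
        r = rng.random()
        if r < sleep_fraction:
            flags.append("sleep")
        elif r < sleep_fraction + free_fraction:
            flags.append("free")
        else:
            flags.append("work")
    return flags

if __name__ == "__main__":
    flags = assign_rfree_flags(20000, seed=42)
    for name in ("work", "free", "sleep"):
        print(name, flags.count(name))
```

Whatever tool is actually used, the essential point is that the sleep flags
are assigned once and then carried along, untouched, through every refinement
run, so Rsleep remains an unbiased check on the whole procedure.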
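[Appended sketch 2] A rough illustration of Pavel's last point, knowing the
range of plausible R-factor values in a given case: bootstrap-resampling the
free-set reflections gives an estimate of the purely statistical scatter of
Rfree due to the finite size of the test set. It deliberately ignores the
other effect discussed above, refinement following a different pathway when
the free set changes, so the true run-to-run spread can be larger (up to the
~4% Nicholas mentions for small data sets). The plain-Python R-factor and
function names are illustrative, not any program's API; real amplitudes would
come from an MTZ file via a proper crystallographic library.

```python
# Illustrative only.
import random

def r_factor(f_obs, f_calc):
    """Conventional R = sum|Fobs - Fcalc| / sum|Fobs|."""
    num = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return num / den

def bootstrap_rfree_sigma(f_obs_free, f_calc_free, n_trials=1000, seed=0):
    """Mean and standard deviation of R over bootstrap resamples of the free set."""
    rng = random.Random(seed)
    n = len(f_obs_free)
    values = []
    for _ in range(n_trials):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(r_factor([f_obs_free[i] for i in idx],
                               [f_calc_free[i] for i in idx]))
    mean = sum(values) / n_trials
    sigma = (sum((v - mean) ** 2 for v in values) / (n_trials - 1)) ** 0.5
    return mean, sigma
```

A difference between two refinement runs that is comparable to (or smaller
than) a few times this sigma is hard to call significant on statistical
grounds alone.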