I think that we are using the test set for many things:

1. Determining and communicating to others whether our overall procedure
is overfitting the data.

2. Identifying the optimal overall procedure in cases where very different
options are being considered (e.g., should I use TLS?).

3. Calculating specific parameters (e.g., sigmaA).

4. Identifying the "best" set of overall parameters.

I would suggest that we should generally restrict our usage of the test
set to purposes #1-3.  Given a particular overall procedure for
refinement, a very good set of parameters should be obtainable from the
working set of data.

In particular, approaches in which many parameters (in the limit... all
parameters) are fit to minimize Rfree do not seem likely to produce the
best model overall.  It might be worth doing some experiments with the
super-free set approach to determine whether this is true.
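
As a starting point for such an experiment, here is a minimal sketch of how
the reflections might be partitioned. It assumes the "super-free" set is a
third group of reflections that is never used for refinement or for any
parameter choice, and is only looked at once at the very end. The function
name, the 5% fractions and the seed are hypothetical choices for
illustration, not taken from any refinement program.

```python
import random


def split_reflections(n_reflections, free_fraction=0.05,
                      super_free_fraction=0.05, seed=0):
    """Partition reflection indices into working, free and super-free sets.

    The fractions and the seed are illustrative assumptions only.
    """
    indices = list(range(n_reflections))
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(indices)

    n_free = int(round(n_reflections * free_fraction))
    n_super = int(round(n_reflections * super_free_fraction))

    free = sorted(indices[:n_free])                        # used for Rfree
    super_free = sorted(indices[n_free:n_free + n_super])  # never consulted until the end
    working = sorted(indices[n_free + n_super:])           # used for refinement
    return working, free, super_free


working, free, super_free = split_reflections(10000)
print(len(working), len(free), len(super_free))
```

The experiment would then be: tune parameters against Rfree, and check
whether the R-factor on the super-free set follows Rfree down or stays
behind.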


>> Hi,
>>
>> On Sun, Oct 16, 2011 at 7:48 PM, Ed Pozharski
>> <epozh...@umaryland.edu> wrote:
>>
>>> On Sat, 2011-10-15 at 11:48 +0300, Nicholas M Glykos wrote:
>>> > > > For structures with a small number of reflections, the statistical
>>> > > > noise in the 5% sets can be very significant indeed. We have seen
>>> > > > differences between Rfree values obtained from different sets
>>> > > > reaching up to 4%.
>>>
>>
>> this is in line with my observations too.
>> Not surprising at all, though (see my previous post on this subject): a
>> small seemingly insignificant change somewhere may result in refinement
>> taking a different pathway leading to a different local minimum. There is
>> even a way of making practical use of this (Rice, Shamoo & Brunger, 1998;
>> Korostelev, Laurberg & Noller, 2009; ...).
>>
>> This "seemingly insignificant change somewhere" may be:
>> - what Ed mentioned (a different noise level in the free reflections, or
>> simply a different strength of reflections in the free set between sets);
>> - slightly different starting conditions (starting parameter values);
>> - the random seed used in the X-ray/restraints target weight calculation
>> (applies to phenix.refine);
>> - I can go on for 10+ possibilities.
>>
>> I do not know whether choosing the result with the lowest Rfree is a good
>> idea or not (after reading Ed's post I am slightly puzzled now), but what's
>> definitely a good idea in my opinion is to know the range of possible
>> R-factor values in your specific case, so you know which difference
>> between two R-factors obtained in two refinement runs is significant and
>> which one is not.
>>
>> Pavel
>>
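
The range of R-factor values suggested in the quoted message could be
estimated along these lines. This is only a sketch: the Rfree values below
are made-up numbers standing in for repeated refinement runs (different
seeds or different free sets), and the 2-sigma threshold is a crude rule of
thumb I am assuming here, not an established criterion.

```python
import statistics


def rfree_spread(rfree_values):
    """Return (mean, sample standard deviation, range) of repeated Rfree values."""
    mean = statistics.mean(rfree_values)
    stdev = statistics.stdev(rfree_values)
    spread = max(rfree_values) - min(rfree_values)
    return mean, stdev, spread


def is_significant(r_a, r_b, stdev, n_sigma=2.0):
    """Crude check: is the difference larger than n_sigma run-to-run deviations?"""
    return abs(r_a - r_b) > n_sigma * stdev


# Hypothetical Rfree values from five otherwise identical refinement runs.
rfree_runs = [0.241, 0.248, 0.236, 0.252, 0.244]
mean, stdev, spread = rfree_spread(rfree_runs)

print(f"mean={mean:.4f} stdev={stdev:.4f} range={spread:.4f}")
print(is_significant(0.244, 0.250, stdev))  # within the run-to-run noise
print(is_significant(0.244, 0.270, stdev))  # well outside it
```

With a spread like this, a 0.6% difference in Rfree between two protocols
says nothing, which is the point made above.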
