Oh I see, I thought the answer followed from that. A fraction is better (or maybe a fraction with a cap). Hardwiring a number may not always work: for small crystals, small data sets, or incomplete data sets, 1000 reflections, say, may be 50% of the data set.
All the best,
Pavel

On Fri, Nov 21, 2014 at 8:09 AM, Keller, Jacob <kell...@janelia.hhmi.org> wrote:

> Agree with all of this, but how does it reflect on the original question
> of whether to use a percent or an absolute number?
>
> JPK
>
> *From:* Pavel Afonine [mailto:pafon...@gmail.com]
> *Sent:* Friday, November 21, 2014 11:02 AM
> *To:* Keller, Jacob
> *Cc:* CCP4BB@JISCMAIL.AC.UK
> *Subject:* Re: [ccp4bb] Free Reflections as Percent and not a Number
>
> Hello,
>
> choice of the size of free (or test, whatever you like to call them)
> reflections is important for three different purposes:
>
> - estimation of parameters for ML target for refinement;
> - map calculation (coefficients m&D in 2mFo-DFc or mFo-DFc map are
>   calculated using test reflections);
> - validation (calculation Rfree).
>
> It is important that free reflections are evenly distributed across the
> whole resolution range, and each sufficiently thin resolution bin contains
> at least 50 test reflections so that the estimation of ML parameters is
> robust and reliable. "Sufficiently thin resolution bin" is such that ML
> parameters can be assumed constants in it.
>
> Smaller test sets will result in less stable refinements (refinement
> outcome will strongly depend on the choice of test set).
>
> Larger test sets will damage map quality (unless all reflections are used
> in map calculation).
>
> Size of free set needs to be sufficiently large so that Rfree is
> statistically meaningful.
>
> Nothing new is said above, it's all documented in the literature!
>
> Pavel
>
> On Thu, Nov 20, 2014 at 2:43 PM, Keller, Jacob <kell...@janelia.hhmi.org>
> wrote:
>
> Dear Crystallographers,
>
> I thought that for reliable values for Rfree, one needs only to satisfy
> counting statistics, and therefore using at most a couple thousand
> reflections should always be sufficient. Almost always, however, some
> seemingly-arbitrary percentage of reflections is used, say 5%. Is there any
> rationale for using a percentage rather than some absolute number like 1000?
>
> All the best,
>
> Jacob