What is the difference between Rmerge and Rsym - I thought they were the same? Rrim == Rmeas I think
Phil

> On 10 Jul 2017, at 15:18, John Berrisford <j...@ebi.ac.uk> wrote:
>
> Dear Herman
>
> The new PDB deposition system (OneDep) allows you to enter values for Rmerge, Rsym, Rpim, Rrim and/or CC half. If, during deposition, you do not provide a value for any of these metrics then we will ask you for a value for one of them.
>
> Also, PDB format is a legacy format for the PDB. In 2014 mmCIF became the archive format for the PDB and some large entries are no longer distributed in PDB format. mmCIF is not limited by the constraints of punch cards.
>
> Please see https://www.wwpdb.org/documentation/file-formats-and-the-pdb
>
> Regards
>
> John
>
> PDBe
>
> On 10/07/2017 09:26, herman.schreu...@sanofi.com wrote:
>> Dear All,
>>
>> For me, this whole discussion is an example of a large number of people barking up the wrong tree. The real issue is not whether data processing programs print an Rmerge among many other quality indicators, but the fact that the PDB and many journals still insist on using Rmerge as the primary quality indicator. As long as this is true, novice scientists might be led to believe that Rmerge is the most important quality indicator. As soon as the PDB and the journals request some other indicator, this will be over. So that is where we should direct our efforts.
>>
>> I don't understand at all why the PDB still insists on an obsolete quality indicator. However, the PDB format for the coordinates also dates back to the 1960's, to be used with punch cards.
>>
>> My 2 cents.
>> Herman
>>
>> -----Original Message-----
>> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On behalf of Edward A. Berry
>> Sent: Saturday, 8 July 2017 22:31
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: Re: [ccp4bb] Rmergicide Through Programming
>>
>> But R-merge is not really narrower as a fraction of the mean value - it just gets smaller proportionately as all the numbers get smaller: the RMSD of 0.0043 for R-meas, multiplied by a factor of 0.022/0.027, gives 0.0035, which is the RMSD for Rmerge. The same was true in the previous example. You could multiply R-meas by 0.5 or 0.2 and get a sharper distribution yet! And that factor would be constant, whereas this one only applies at super-low redundancy.
>>
>> On 07/08/2017 03:23 PM, James Holton wrote:
>>> The expected distribution of Rmeas values is still wider than that of Rmerge for data with I/sigma=30 and average multiplicity=2.0. Graph attached.
>>>
>>> I expect that anytime you incorporate more than one source of information you run the risk of a noisier statistic, because every source of information can contain noise. That is, Rmeas combines information about multiplicity with the absolute deviates in the data to form a statistic that is more accurate than Rmerge, but also (potentially) less precise.
>>>
>>> Perhaps that is what we are debating here? Which is better: accuracy or precision? Personally, I prefer to know both.
>>>
>>> -James Holton
>>> MAD Scientist
>>>
>>> On 7/8/2017 11:02 AM, Frank von Delft wrote:
>>>> It is quite easy to end up with low multiplicities in the low resolution shell, especially for low symmetry and fast-decaying crystals.
>>>>
>>>> It is this scenario where Rmerge (lowres) is more misleading than Rmeas.
>>>>
>>>> phx
>>>>
>>>> On 08/07/2017 17:31, James Holton wrote:
>>>>> What does Rmeas tell us that Rmerge doesn't? Given that we know the multiplicity?
>>>>>
>>>>> -James Holton
>>>>> MAD Scientist
>>>>>
>>>>> On 7/8/2017 9:15 AM, Frank von Delft wrote:
>>>>>> Anyway, back to reality: does anybody still use R statistics to evaluate anything other than /strong/ data?
>>>>>> Certainly I never look at it except for the low-resolution bin (or strongest reflections). Specifically, a "2%-dataset" in that bin is probably healthy, while a "9%-dataset" probably Has Issues.
>>>>>>
>>>>>> In which case, back to Jacob's question: what does Rmerge tell us that Rmeas doesn't?
>>>>>>
>>>>>> phx
>>>>>>
>>>>>> On 08/07/2017 17:02, James Holton wrote:
>>>>>>> Sorry for the confusion. I was going for brevity! And failed.
>>>>>>>
>>>>>>> I know that the multiplicity correction is applied on a per-hkl basis in the calculation of Rmeas. However, the average multiplicity over the whole calculation is most likely not an integer. Some hkls may be observed twice while others only once, or perhaps 3-4 times in the same scaling run.
>>>>>>>
>>>>>>> Allow me to do the error propagation properly. Consider the scenario:
>>>>>>>
>>>>>>> Your outer resolution bin has a true I/sigma = 1.00 and average multiplicity of 2.0. Let's say there are 100 hkl indices in this bin. I choose the "true" intensities of each hkl from an exponential (aka Wilson) distribution. Further assume the background is high, so the error in each observation after background subtraction may be taken from a Gaussian distribution. Let's further choose the per-hkl multiplicity from a Poisson distribution with expectation value 2.0, so 0 is possible, but the long-term average multiplicity is 2.0. For R calculation, when the multiplicity of any given hkl is less than 2 it is skipped. What I end up with after 120,000 trials is a distribution of values for each R factor. See attached graph.
>>>>>>>
>>>>>>> What I hope is readily apparent is that the distribution of Rmerge values is taller and sharper than that of the Rmeas values. The most likely Rmeas is 80% and that of Rmerge is 64.6%. This is expected, of course.
>>>>>>> But what I hope to impress upon you is that the most likely value is not generally the one that you will get! The distribution has a width. Specifically, Rmeas could be as low as 40%, or as high as 209%, depending on the trial. Half of the trial results fall between 71.4% and 90.3%, a range of 19 percentage points. Rmerge has a middle-half range from 57.6% to 72.9% (15.3 percentage points). This range of possible values of Rmerge or Rmeas from data with the same intrinsic quality is what I mean when I say "numerical instability". Each and every trial had the same true I/sigma and multiplicity, and yet the R factors I get vary depending on the trial. Unfortunately for most of us with real data, you only ever get one trial, and you can't predict which Rmeas or Rmerge you'll get.
>>>>>>>
>>>>>>> My point here is that R statistics in general are not comparable from experiment to experiment when you are looking at data with low average intensity and low multiplicity, and it appears that Rmeas is less stable than Rmerge. Not by much, mind you, but it still jumps around more.
>>>>>>>
>>>>>>> Hope that is clearer?
>>>>>>>
>>>>>>> Note that in no way am I suggesting that low multiplicity is the right way to collect data. Far from it. Especially with modern detectors that have negligible read-out noise. But when microcrystals only give off a handful of photons each before they die, low multiplicity might be all you have.
>>>>>>>
>>>>>>> -James Holton
>>>>>>> MAD Scientist
>>>>>>>
>>>>>>> On 7/7/2017 2:33 PM, Edward A. Berry wrote:
>>>>>>>> I think the confusion here is that the "multiplicity correction" is applied on each reflection, where it will be an integer 2 or greater (can't estimate variance with only one measurement).
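The Monte Carlo experiment James describes above can be sketched in a few dozen lines. This is a minimal, stdlib-only sketch under the stated assumptions (unit-mean exponential true intensities, Poisson multiplicities with mean 2.0, Gaussian observational error with sigma equal to the bin-mean intensity, reflections with fewer than two observations skipped); the function names are illustrative, not from any released program, and exact numbers will differ from the attached graph.

```python
import math
import random
import statistics

def sample_poisson(rng, lam):
    """Poisson sample via Knuth's product-of-uniforms method (fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p < limit:
            return k
        k += 1

def one_trial(rng, n_hkl=100, mean_mult=2.0, sigma=1.0):
    """Simulate one resolution bin and return (Rmerge, Rmeas).
    Rmerge = sum_hkl sum_i |I_i - <I>| / sum I;
    Rmeas applies the per-hkl sqrt(n/(n-1)) factor to the numerator."""
    num_merge = num_meas = den = 0.0
    for _ in range(n_hkl):
        true_intensity = rng.expovariate(1.0)   # Wilson-like intensity, mean 1
        n = sample_poisson(rng, mean_mult)
        if n < 2:                               # no deviate from fewer than 2 obs
            continue
        obs = [true_intensity + rng.gauss(0.0, sigma) for _ in range(n)]
        mean_obs = sum(obs) / n
        dev = sum(abs(o - mean_obs) for o in obs)
        num_merge += dev
        num_meas += math.sqrt(n / (n - 1)) * dev
        den += sum(obs)
    return num_merge / den, num_meas / den

def middle_half(sorted_xs):
    """Interquartile range of an already-sorted list."""
    q1 = sorted_xs[len(sorted_xs) // 4]
    q3 = sorted_xs[3 * len(sorted_xs) // 4]
    return q3 - q1

rng = random.Random(1)
trials = [one_trial(rng) for _ in range(2000)]
rmerge = sorted(t[0] for t in trials)
rmeas = sorted(t[1] for t in trials)
print("median Rmerge: %.3f   median Rmeas: %.3f"
      % (statistics.median(rmerge), statistics.median(rmeas)))
print("middle-half width Rmerge: %.3f   Rmeas: %.3f"
      % (middle_half(rmerge), middle_half(rmeas)))
```

With these assumptions the Rmeas distribution sits above the Rmerge distribution (the per-reflection factor is always > 1), and both show the substantial trial-to-trial spread that is the point of the argument.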
>>>>>>>> You can only correct in an approximate way using the average multiplicity of the dataset, since the correction would depend on the distribution of multiplicity over the reflections.
>>>>>>>>
>>>>>>>> And the correction is for R-merge. You don't need to apply a correction to R-meas. R-meas is a redundancy-independent best estimate of the variance. Whatever you would have used R-merge for (hopefully making allowance for the multiplicity) you can use R-meas for instead and not worry about multiplicity. Again, what information does R-merge provide that R-meas does not provide in a more accurate way?
>>>>>>>>
>>>>>>>> According to the Denzo manual, one way to artificially reduce R-merge is to include reflections with only one measurement (averaging in a lot of zeros always helps bring an average down), and they say there were actually some programs that did that. However, I'm quite sure none of the ones we rely on today do that.
>>>>>>>>
>>>>>>>> On 07/07/2017 03:12 PM, Kay Diederichs wrote:
>>>>>>>>> James,
>>>>>>>>>
>>>>>>>>> I cannot follow you. "n approaches 1" can only mean n = 2, because n is an integer. And for n=2 the sqrt(n/(n-1)) factor is well-defined. For n=1, neither contributions to Rmeas nor Rmerge nor to any other precision indicator can be calculated anyway, because there's nothing this measurement can be compared against.
>>>>>>>>>
>>>>>>>>> just my 2 cents,
>>>>>>>>>
>>>>>>>>> Kay
>>>>>>>>>
>>>>>>>>> On Fri, 7 Jul 2017 10:57:17 -0700, James Holton <jmhol...@slac.stanford.edu> wrote:
>>>>>>>>>
>>>>>>>>>> I happen to be one of those people who think Rmerge is a very useful statistic.
>>>>>>>>>> Not as a method of evaluating the resolution limit, which is mathematically ridiculous, but for a host of other important things, like evaluating the performance of data collection equipment, and evaluating the isomorphism of different crystals, to name a few.
>>>>>>>>>>
>>>>>>>>>> I like Rmerge because it is a simple statistic that has a simple formula and has not undergone any "corrections". Corrections increase complexity, and complexity opens the door to manipulation by the desperate and/or misguided. For example, overzealous outlier rejection is a common way to abuse R factors, and it is far too often swept under the rug, sometimes without the user even knowing about it. This is especially problematic when working in a regime where the statistic of interest is unstable, and for R factors this is low-intensity data. Rejecting just the right "outliers" can make any R factor look a lot better. Why would Rmeas be any more unstable than Rmerge? Look at the formula. There is an "n-1" in the denominator, where n is the multiplicity. So, what happens when n approaches 1? What happens when n=1? This is not to say Rmerge is better than Rmeas. In fact, I believe the latter is generally superior to the former, unless you are working near n = 1. The sqrt(n/(n-1)) factor is trying to correct for bias in the R statistic, but fighting one infinity with another infinity is a dangerous game.
>>>>>>>>>>
>>>>>>>>>> My point is that neither Rmerge nor Rmeas is easily interpreted without knowing the multiplicity. If you see Rmeas = 10% and the multiplicity is 10, then you know what that means. Same for Rmerge, since at n=10 both stats have nearly the same value.
>>>>>>>>>> But if you have Rmeas = 45% and multiplicity = 1.05, what does that mean? Rmeas will be only 33% if the multiplicity is rounded up to 1.1. This is what I mean by "numerical instability": the value of the R statistic itself becomes sensitive to small amounts of noise, and behaves more and more like a random number generator. And if you have Rmeas = 33% and no indication of multiplicity, it is hard to know what is going on. I personally am a lot more comfortable seeing qualitative agreement between Rmerge and Rmeas, because that means the numerical instability of the multiplicity correction didn't mess anything up.
>>>>>>>>>>
>>>>>>>>>> Of course, when the intensity is weak, R statistics in general are not useful. Both Rmeas and Rmerge have the sum of all intensities in the denominator, so when the bin-wide sum approaches zero you have another infinity to contend with. This one starts to rear its ugly head once I/sigma drops below about 3, and this is why our ancestors always applied a sigma cutoff before computing an R factor. Our small-molecule colleagues still do this! They call it "R1". And it is an excellent indicator of the overall relative error. The relative error in the outermost bin is not meaningful, and strangely enough nobody ever reported the outer-resolution Rmerge before 1995.
>>>>>>>>>>
>>>>>>>>>> For weak signals, correlation coefficients are better, but for strong signals CC pegs out at >95%, making it harder to see relative errors. I/sigma is what we'd like to know, but the value of "sigma" is still prone to manipulation by not just outlier rejection, but massaging of the so-called "error model".
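The 45% vs 33% sensitivity quoted above follows directly from the sqrt(n/(n-1)) factor evaluated at the average multiplicity. A back-of-envelope check, assuming the whole-bin approximation Rmeas ~ Rmerge * sqrt(n/(n-1)) (the real correction is applied per reflection):

```python
import math

def rmeas_over_rmerge(avg_mult):
    """Whole-bin approximation to the multiplicity correction factor."""
    return math.sqrt(avg_mult / (avg_mult - 1.0))

# The factor explodes as n -> 1 and converges to 1 at high multiplicity:
for n in (1.05, 1.10, 2.0, 10.0):
    print("n = %5.2f   sqrt(n/(n-1)) = %5.2f" % (n, rmeas_over_rmerge(n)))

# A nominal Rmeas of 45% at n = 1.05 implies Rmerge = 0.45 / 4.58, about 9.8%;
# re-correcting that same Rmerge at n = 1.10 gives roughly 33%:
rmerge = 0.45 / rmeas_over_rmerge(1.05)
print("re-corrected Rmeas at n = 1.10: %.0f%%"
      % (100 * rmerge * rmeas_over_rmerge(1.10)))
```

At n = 2 the factor is sqrt(2), about 1.41 (the "factor of ~1.4" divergence mentioned elsewhere in this thread), while at n = 10 it is only 1.05, which is why Rmerge and Rmeas nearly coincide at high multiplicity.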
>>>>>>>>>> Suffice it to say, crystallographic data contain more than one type of error. Some sources are important for weak spots, others are important for strong spots, and still others are only apparent in the mid-range. Some sources of error are only important at low multiplicity, and others only manifest at high multiplicity. There is no single number that can be used to evaluate all aspects of data quality.
>>>>>>>>>>
>>>>>>>>>> So, I remain a champion of reporting Rmerge. Not in the high-angle bin, because that is essentially a random number, but reporting overall Rmerge and low-angle-bin Rmerge next to multiplicity, Rmeas, CC1/2 and other statistics is the only way you can glean enough information about where the errors in the data are coming from. Rmeas is a useful addition because it helps us correct for multiplicity without having to do math in our heads. Users generally thank you for that. Rmerge, however, has served us well for more than half a century, and I believe Uli Arndt knew what he was doing. I hope we all know enough about history to realize that future generations seldom thank their ancestors for "protecting" them from information.
>>>>>>>>>> -James Holton
>>>>>>>>>> MAD Scientist
>>>>>>>>>>
>>>>>>>>>> On 7/5/2017 10:36 AM, Graeme Winter wrote:
>>>>>>>>>>> Frank,
>>>>>>>>>>>
>>>>>>>>>>> you are asking me to remove features that I like, so I would feel that the challenge is for you to prove that this is harmful. However:
>>>>>>>>>>>
>>>>>>>>>>> - at the minimum, I find it a useful checksum that the stats are internally consistent (though I interpret it for lots of other reasons too)
>>>>>>>>>>> - it is faulty, I agree, but (with caveats) still useful IMHO
>>>>>>>>>>>
>>>>>>>>>>> Sorry for being terse, but I remain to be convinced that removing it increases the amount of information.
>>>>>>>>>>>
>>>>>>>>>>> CC'ing BB as requested
>>>>>>>>>>>
>>>>>>>>>>> Best wishes Graeme
>>>>>>>>>>>
>>>>>>>>>>>> On 5 Jul 2017, at 17:17, Frank von Delft <frank.vonde...@sgc.ox.ac.uk> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> You keep not answering the challenge.
>>>>>>>>>>>>
>>>>>>>>>>>> It's really simple: what information does Rmerge provide that Rmeas doesn't?
>>>>>>>>>>>>
>>>>>>>>>>>> (If you answer, email to the BB.)
>>>>>>>>>>>>
>>>>>>>>>>>> On 05/07/2017 16:04, graeme.win...@diamond.ac.uk wrote:
>>>>>>>>>>>>> Dear Frank,
>>>>>>>>>>>>>
>>>>>>>>>>>>> You are forcefully arguing essentially that others are wrong if we feel an existing statistic continues to be useful, and instead insist that it be outlawed so that we may not make use of it, just in case someone misinterprets it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Very well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I do however express disquiet that we as software developers feel browbeaten to remove the output we find useful because "the community" feels that it is obsolete.
>>>>>>>>>>>>> I feel that Jacob's short story on this thread illustrates that educating the next generation of crystallographers to understand what all of the numbers mean is critical, and that a numerological approach of trying to optimise any one statistic is essentially doomed. Precisely the same argument could be made about people cutting the "resolution" at the wrong place in order to improve the average I/sig(I) of the data set.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Denying access to information is not a solution to misinterpretation, from where I am sat; however, I acknowledge that other points of view exist.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best wishes Graeme
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 5 Jul 2017, at 12:11, Frank von Delft <frank.vonde...@sgc.ox.ac.uk> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Graeme, Andrew
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jacob is not arguing against an R-based statistic; he's pointing out that leaving out the multiplicity weighting is prehistoric (Diederichs & Karplus published it 20 years ago!).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So indeed: Rmerge, Rpim and I/sigI give different information. As you say.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But no: Rmerge and Rmeas and Rcryst do NOT give different information. Except:
>>>>>>>>>>>>>
>>>>>>>>>>>>> * Rmerge is a (potentially) misleading version of Rmeas.
>>>>>>>>>>>>>
>>>>>>>>>>>>> * Rcryst and Rmerge and Rsym are terms that no longer have significance in the single cryo-dataset world.
>>>>>>>>>>>>>
>>>>>>>>>>>>> phx.
>>>>>>>>>>>>> On 05/07/2017 09:43, Andrew Leslie wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to support Graeme in his wish to retain Rmerge in Table 1, essentially for exactly the same reasons.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also strongly support Francis Reyes' comment about the usefulness of Rmerge at low resolution, and I would add to his list that it can also, in some circumstances, be more indicative of the wrong choice of symmetry (too high) than the statistics that come from POINTLESS (excellent though that program is!).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Andrew
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 5 Jul 2017, at 05:44, Graeme Winter <graeme.win...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Jacob
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, I got this - and I appreciate the benefit of Rmeas for dealing with measuring agreement for small-multiplicity observations. Having this *as well* is very useful, and I agree Rmeas / Rpim / CC-half should be the primary "quality" statistics.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, you asked if there is any reason to *keep* rather than *eliminate* Rmerge, and I offered one :o)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I do not see what harm there is in reporting Rmerge, even if it is just used in the inner shell or just used to capture a flavour of the data set overall.
>>>>>>>>>>>>> I also appreciate that Rmeas converges to the same value for large multiplicity, i.e.:
>>>>>>>>>>>>>
>>>>>>>>>>>>>                                          Overall  InnerShell  OuterShell
>>>>>>>>>>>>> Low resolution limit                       39.02       39.02        1.39
>>>>>>>>>>>>> High resolution limit                       1.35        6.04        1.35
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rmerge (within I+/I-)                      0.080       0.057       2.871
>>>>>>>>>>>>> Rmerge (all I+ and I-)                     0.081       0.059       2.922
>>>>>>>>>>>>> Rmeas (within I+/I-)                       0.081       0.058       2.940
>>>>>>>>>>>>> Rmeas (all I+ & I-)                        0.082       0.059       2.958
>>>>>>>>>>>>> Rpim (within I+/I-)                        0.013       0.009       0.628
>>>>>>>>>>>>> Rpim (all I+ & I-)                         0.009       0.007       0.453
>>>>>>>>>>>>> Rmerge in top intensity bin                0.050           -           -
>>>>>>>>>>>>> Total number of observations             1265512       16212       53490
>>>>>>>>>>>>> Total number unique                        17515         224        1280
>>>>>>>>>>>>> Mean((I)/sd(I))                             29.7       104.3         1.5
>>>>>>>>>>>>> Mn(I) half-set correlation CC(1/2)         1.000       1.000       0.778
>>>>>>>>>>>>> Completeness                               100.0        99.7       100.0
>>>>>>>>>>>>> Multiplicity                                72.3        72.4        41.8
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anomalous completeness                     100.0       100.0       100.0
>>>>>>>>>>>>> Anomalous multiplicity                      37.2        42.7        21.0
>>>>>>>>>>>>> DelAnom correlation between half-sets      0.497       0.766      -0.026
>>>>>>>>>>>>> Mid-Slope of Anom Normal Probability       1.039           -           -
>>>>>>>>>>>>>
>>>>>>>>>>>>> (this is a good case for Rpim & CC-half as resolution limit criteria)
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the statistics you want to use are there, and some others also, what is the pressure to remove them? Surely we want to educate on how best to interpret the entire table above to get a fuller picture of the overall quality of the data?
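The convergence claim can be checked against the overall column of the table above. A quick sketch using the quoted "within I+/I-" numbers; the agreement is only approximate because the real correction is applied per reflection, not at the dataset average:

```python
import math

# Overall numbers quoted in the merging table above (within I+/I-):
multiplicity = 72.3
rmerge, rmeas = 0.080, 0.081

# Whole-dataset approximation of the multiplicity correction:
predicted_ratio = math.sqrt(multiplicity / (multiplicity - 1.0))
print("observed Rmeas/Rmerge = %.4f   sqrt(n/(n-1)) = %.4f"
      % (rmeas / rmerge, predicted_ratio))
```

Both ratios are within about 1% of unity at this multiplicity, which is the sense in which Rmeas and Rmerge converge for highly redundant data.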
>>>>>>>>>>>>> My 0th-order request would be to publish the three shells as above ;o)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers Graeme
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4 Jul 2017, at 22:09, Keller, Jacob <kell...@janelia.hhmi.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I suggested replacing Rmerge/sym/cryst with Rmeas, not Rpim. Rmeas is simply Rmerge * sqrt(n/(n-1)), where n is the number of measurements of that reflection. It's merely a way of correcting for the multiplicity-related artifact of Rmerge, which is becoming even more of a problem with data sets of increasing variability in multiplicity. Consider the case of comparing a data set with a multiplicity of 2 versus one of 100: equivalent data quality would yield Rmerges diverging by a factor of ~1.4. But this has all been covered before in several papers. It can be and is reported in resolution bins, so it can be used exactly as you say. So, why not "disappear" Rmerge from the software?
>>>>>>>>>>>>>
>>>>>>>>>>>>> The only reason I could come up with for keeping it is historical reasons or comparisons to previous datasets, but anyway those comparisons would be confounded by variabilities in multiplicity and a hundred other things, so come on, developers, just comment it out!
>>>>>>>>>>>>>
>>>>>>>>>>>>> JPK
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: graeme.win...@diamond.ac.uk [mailto:graeme.win...@diamond.ac.uk]
>>>>>>>>>>>>> Sent: Tuesday, July 04, 2017 4:37 PM
>>>>>>>>>>>>> To: Keller, Jacob <kell...@janelia.hhmi.org>
>>>>>>>>>>>>> Cc: ccp4bb@jiscmail.ac.uk
>>>>>>>>>>>>> Subject: Re: [ccp4bb] Rmergicide Through Programming
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Jacob
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unbiased estimate of the true unmerged I/sig(I) of your data (I find this particularly useful at low resolution), i.e. if your inner shell Rmerge is 10%, your data agree very poorly; if it is 2%, your data agree very well, provided you have sensible multiplicity... obviously depends on sensible interpretation. Rpim hides this (though it tells you more about the quality of the average measurement).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Essentially, for I/sig(I) you can (by and large) adjust your sig(I) values however you like if you were so inclined. You can only adjust Rmerge by excluding measurements.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would therefore defend that - amongst the other stats you enumerate below - it still has a place.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers Graeme
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4 Jul 2017, at 14:10, Keller, Jacob <kell...@janelia.hhmi.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rmerge does contain information which complements the others.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What information? I was trying to think of a counterargument to what I proposed, but could not think of a reason in the world to keep reporting it.
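The statistics being debated in this thread can be written down compactly. A minimal sketch of the standard definitions (Rmerge with no weight, Rmeas with the per-reflection sqrt(n/(n-1)) weight as in Jacob's formula above, Rpim with sqrt(1/(n-1))); the function name and input layout are illustrative, not taken from any processing program:

```python
import math
from collections import defaultdict

def merging_r_factors(observations):
    """observations: iterable of (hkl, intensity) pairs of unmerged data.
    Returns (Rmerge, Rmeas, Rpim), skipping reflections measured only once."""
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)
    num_merge = num_meas = num_pim = den = 0.0
    for obs in groups.values():
        n = len(obs)
        if n < 2:
            continue                          # a single measurement has no deviate
        mean_intensity = sum(obs) / n
        dev = sum(abs(i - mean_intensity) for i in obs)
        num_merge += dev                                # Rmerge: unweighted
        num_meas += math.sqrt(n / (n - 1)) * dev        # Rmeas: redundancy-independent
        num_pim += math.sqrt(1.0 / (n - 1)) * dev       # Rpim: precision of the merged I
        den += sum(obs)
    return num_merge / den, num_meas / den, num_pim / den

# Two reflections, each measured twice with the same fractional scatter:
data = [((1, 0, 0), 95.0), ((1, 0, 0), 105.0),
        ((0, 2, 0), 190.0), ((0, 2, 0), 210.0)]
rmerge, rmeas, rpim = merging_r_factors(data)
print("Rmerge %.4f  Rmeas %.4f  Rpim %.4f" % (rmerge, rmeas, rpim))
```

At uniform multiplicity 2 this gives Rmeas = sqrt(2) * Rmerge and Rpim = Rmerge, which is exactly the regime in which the choice of statistic matters; at high multiplicity the three weights converge and the argument largely disappears.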
>>>>>>>>>>>>> JPK
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4 Jul 2017, at 12:00, Keller, Jacob <kell...@janelia.hhmi.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Crystallographers,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Having been repeatedly chagrined about the continued use and reporting of Rmerge rather than Rmeas or similar, I thought of a potential way to promote the change: what if merging programs would completely omit Rmerge/cryst/sym? Is there some reason to continue to report these stats, or are they just grandfathered into the software? I doubt that any journal or crystallographer would insist on reporting Rmerge per se. So, I wonder what developers would think about commenting out a few lines of their code and seeing what happens? Maybe a comment to the effect of "Rmerge is now deprecated; use Rmeas" would be useful as well. Would something catastrophic happen?
>>>>>>>>>>>>>
>>>>>>>>>>>>> All the best,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jacob Keller
>>>>>>>>>>>>>
>>>>>>>>>>>>> *******************************************
>>>>>>>>>>>>> Jacob Pearson Keller, PhD
>>>>>>>>>>>>> Research Scientist
>>>>>>>>>>>>> HHMI Janelia Research Campus / Looger lab
>>>>>>>>>>>>> Phone: (571)209-4000 x3159
>>>>>>>>>>>>> Email: kell...@janelia.hhmi.org
>>>>>>>>>>>>> *******************************************

> --
> John Berrisford
> PDBe
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD UK
> Tel: +44 1223 492529
>
> http://www.pdbe.org
> http://www.facebook.com/proteindatabank
> http://twitter.com/PDBeurope