Re: [ccp4bb] MAD
Hi Peter You are right: the location of the prism experiment is most likely the study at Woolsthorpe, e.g. see http://www.isaacnewton.org.uk/texts/OfColours7 . Newton was admitted to Trinity College in 1661 as a 'sizar' (a paid part-time student employed by the College) but was forced to return to Woolsthorpe (the family home) in August 1665 (http://www.isaacnewton.org.uk/Chronology) to continue studying privately, because the University closed temporarily as a precaution against the Great Plague, which was spreading outwards from the initial outbreak in this country in the London Docklands during the summer of that year. He returned to Trinity in 1667 as a Fellow of the College. So I should have been more precise and said that Newton performed the prism experiment during the time that he was associated with Trinity (it's not clear what the nature of his association with Trinity was during the 2 years he spent doing experiments at Woolsthorpe). Cheers -- Ian On 28 January 2012 09:35, Peter Moody wrote: > Ian, > If you visit Isaac Newton's old home at Woolsthorpe (near here) you will see > a conflicting claim for location of the classic prism experiment. You will > also find an apple tree in the garden, but that is another story.. > > Peter > > PS this is my special ccp4bb email account, it doesn't always get the > attention it deserves. > > > On 19 January 2012 17:50, Ian Tickle wrote: >> >> Perhaps I could chime in with a bit of history as I understand it. >> >> The term 'dispersion' in optics, as everyone who knows their history >> is aware, refers to the classic experiment by Sir Isaac Newton at >> Trinity College here in Cambridge where he observed white light being >> split up ('dispersed') into its component colours by a prism. This is >> of course due to the variation in refractive index of glass with >> wavelength, so then we arrive at the usual definition of optical >> dispersion as dn/dlambda, i.e. 
the first derivative of the refractive >> index with respect to the wavelength. >> >> Now the refractive index of an average crystal at around 1 Ang >> wavelength differs by about 1 part in a million from 1, however it can >> be determined by very careful and precise interferometric experiments. >> It's safe to say therefore that the dispersion of X-rays (anomalous >> or otherwise) has no measurable effect whatsoever as far as the >> average X-ray diffraction experiment (SAD, MAD or otherwise) is >> concerned. The question then is how did the term 'anomalous >> dispersion' get to be applied to X-ray diffraction? The answer is >> that it turns out that the equation (the 'Kramers-Kronig relation') >> governing X-ray scattering is completely analogous to that governing >> optical dispersion, so it's legitimate to use the term 'dispersive' >> (meaning 'analogous to dispersion') for the real part of the >> wavelength-dependent component of the X-ray scattering factor, because >> the real part of the refractive index is what describes dispersion >> (the imaginary part in both cases describes absorption). >> >> So then from 'dispersive' to 'dispersion' to describe the wavelength >> dependence of X-ray scattering is only a short step, even though it >> only behaves _like_ dispersion in its dependence on wavelength. >> However having two different meanings for the same word can get >> confusing and clearly should be avoided if at all possible. >> >> So what does this have to do with the MAD acronym? I think it stemmed >> from a visit by Wayne Hendrickson to Birkbeck in London some time >> around 1990: he was invited by Tom Blundell to give a lecture on his >> MAD experiments. At that time Wayne called it multi-wavelength >> anomalous dispersion. Tom pointed out that this was really a misnomer >> for the reasons I've elucidated above. 
Wayne liked the MAD acronym >> and wanted to keep it so he needed a replacement term starting with D >> and diffraction was the obvious choice, and if you look at the >> literature from then on Wayne at least consistently called it >> multi-wavelength anomalous diffraction. >> >> Cheers >> >> -- Ian >> >> On 18 January 2012 18:23, Phil Jeffrey wrote: >> > Can I be dogmatic about this ? >> > >> > Multiwavelength anomalous diffraction from Hendrickson (1991) Science >> > Vol. >> > 254 no. 5028 pp. 51-58 >> > >> > Multiwavelength anomalous diffraction (MAD) from the CCP4 proceedings >> > http://www.ccp4.ac.uk/courses/proceedings/1997/j_smith/main.html >> > >> > Multi-wavelength anomalous-diffraction (MAD) from Terwilliger Acta >> > Cryst. >> > (1994). D50, 11-16 >> > >> > etc. >> > >> > >> > I don't see where the problem lies: >> > >> > a SAD experiment is a single wavelength experiment where you are using >> > the >> > anomalous/dispersive signals for phasing >> > >> > a MAD experiment is a multiple wavelength version of SAD. Hopefully one >> > picks an appropriate range of wavelengths for whatever complex case one >> > has. >> > >> > One can h
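Ian's point that the wavelength-dependent ('dispersive') part of the scattering factor is tied to absorption through the Kramers-Kronig relation can be made concrete numerically. The following is only a hedged sketch: the step-function f'', the edge energy, and the grid are invented for illustration, and sign conventions for the transform differ between textbooks.

```python
import numpy as np

# Toy Kramers-Kronig sketch relating the absorptive term f''(E) to the
# dispersive term f'(E).  f'' is modelled as a crude step switching on at
# an "edge" near 12.66 keV (roughly where the Se K edge sits); real f''
# must be measured or computed, so treat all numbers here as made up.

E_grid = np.linspace(1.0, 30.0, 20000)      # photon energies, keV
E_edge = 12.66
f2 = np.where(E_grid >= E_edge, 4.0, 0.0)   # step-function model of f''(E)

def f_prime(E):
    """f'(E) = (2/pi) P-integral of E' f''(E') / (E^2 - E'^2) dE' (one common convention)."""
    dE = E_grid[1] - E_grid[0]
    integrand = E_grid * f2 / (E**2 - E_grid**2)
    keep = np.abs(E_grid - E) > 2 * dE      # crude principal-value exclusion
    return (2.0 / np.pi) * np.sum(integrand[keep]) * dE

far_below_edge = f_prime(6.0)
just_below_edge = f_prime(12.5)
# f' comes out negative and largest in magnitude close to the edge -- the
# wavelength-dependent ("dispersive") signal exploited in a MAD experiment.
```

Even this crude model reproduces the qualitative behaviour of real anomalous scatterers: f' dips sharply near the absorption edge, which is why MAD wavelengths are chosen there.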
Re: [ccp4bb] protein lost on membrane of centricon!!
Rashmi, There are other membrane materials which might bind less. We have also found that there sometimes seems to be binding to the plastic itself, so a different manufacturer can help. Also, always use the same centricon for the same protein. Sometimes you take a big hit on a centricon's first use and then it is fine for subsequent uses. If the protein prep is not terribly difficult, you could try that centricon a second time. We have used PEG 20K to pull water out. If you do that, you must watch it carefully and immerse the dialysis tubing in final buffer when you are near final concentration. If you just remove the PEG without washing, it will continue to concentrate. Doug Ohlendorf From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of rashmi panigrahi Sent: Saturday, January 28, 2012 9:55 AM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] protein lost on membrane of centricon!! Hi all, I tried to concentrate my protein using vivaspin 20 10,000 MWCO PES. The protein was in 50mM Hepes pH 7.5, 500mM KCl and 10% glycerol. I lost about 90% of my protein on the membrane of the centricon. Please suggest some way of concentrating this protein. Will concentrating using peg 20K be a good alternative?? regards rashmi
Re: [ccp4bb] protein lost on membrane of centricon!!
Another option I didn't mention is to presoak your centricons in 30% glycerol overnight prior to use. Jürgen On Jan 28, 2012, at 12:47 PM, Bosch, Juergen wrote: Are you close to the theoretical isoelectric point of your protein ? Change pH of buffer Jürgen .. Jürgen Bosch Johns Hopkins Bloomberg School of Public Health Department of Biochemistry & Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Phone: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-3655 http://web.mac.com/bosch_lab/ On Jan 28, 2012, at 10:55, "rashmi panigrahi" wrote: Hi all, I tried to concentrate my protein using vivaspin 20 10,000 MWCO PES. The protein was in 50mM Hepes pH 7.5, 500mM KCl and 10% glycerol. I lost about 90% of my protein on the membrane of the centricon. Please suggest some way of concentrating this protein. Will concentrating using peg 20K be a good alternative?? regards rashmi .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry & Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
Re: [ccp4bb] Reasoning for Rmeas or Rpim as Cutoff
Jacob, here's my (personal) take on this: The data quality metrics that everyone uses clearly fall into 2 classes: 'consistency' metrics, i.e. Rmerge/meas/pim and CC(1/2), which measure how well redundant observations agree, and signal/noise ratio metrics, i.e. mean(I/sigma) and completeness, which relate to the information content of the data. IMO the basic problem with all the consistency metrics is that they are not measuring the quantity that is relevant to refinement and electron density maps, namely the information content of the data, at least not in a direct and meaningful way. This is because there are 2 contributors to any consistency metric: the systematic errors (e.g. differences in illuminated volume and absorption) and the random errors (from counting statistics, detector noise etc.). If the data are collected with sufficient redundancy the systematic errors should hopefully largely cancel, and therefore only the random errors will determine the information content. Therefore the systematic error component of the consistency measure (which I suspect is the biggest component, at least for the strong reflections) is not relevant to measuring the information content. If the consistency measure only took into account the random error component (which it can't), then it would essentially be a measure of information content, if only indirectly (but then why not simply use a direct measure such as the signal/noise ratio?). There are clearly at least 2 distinct problems with Rmerge: first, it's including systematic errors in its measure of consistency; second, it's not invariant with respect to the redundancy (and third, it's useless as a statistic anyway because you can't do any significance tests on it!). The redundancy problem is fixed to some extent with Rpim etc, but that still leaves the other problems. 
It's not clear to me that CC(1/2) is any better in this respect, since (as far as I understand how it's implemented), one cannot be sure that the systematic errors will cancel for each half-dataset Imean, so it's still likely to contain a large contribution from the irrelevant systematic error component and so mislead in respect of the real data quality exactly in the same way that Rmerge/meas/pim do. One may as well use the Rmerge between the half dataset Imeans, since there would be no redundancy effect (i.e. the redundancy would be 2 for all included reflections). I did some significance tests on CC(1/2) and I got silly results, for example it says that the significance level for the CC is ~ 0.1, but this corresponded to a huge Rmerge (200%) and a tiny mean(I/sigma) (0.4). It seems that (without any basis in statistics whatsoever) the rule-of-thumb CC > 0.5 is what is generally used, but I would be worried that the statistics are so far divorced from the reality - it suggests that something is seriously wrong with the assumptions! Having said all that, the mean(I/sigma) metric, which on the face of it is much more closely related to the information content and therefore should be a more relevant metric than Rmerge/meas/pim & CC(1/2), is not without its own problems (which probably explains the continuing popularity of the other metrics!). First and most obvious, it's a hostage to the estimate of sigma(I) used. I've never been happy with inflating the counting sigmas to include effects of systematic error based on the consistency of redundant measurements, since as I indicated above if the data are collected redundantly in such a way that the systematic errors largely cancel, it implies that the systematic errors should not be included in the estimate of sigma. 
The fact that the sigma(I)'s would then generally be smaller (at least for the large I's), so the sample variances would be much larger than the counting variances, is irrelevant, because the former includes the systematic errors. Also the I/sigma cut-off used would probably not need to be changed since it affects only the weakest reflections which are largely unaffected by the systematic error correction. The second problem with mean(I/sigma) is also obvious: i.e. it's a mean, and as such it's rather insensitive to the actual distribution of I/sigma(I). For example if a shell contained a few highly significant intensities these could be overwhelmed by a large number of weak data and give an insignificant mean(I/sigma). It seems to me that one should be considering the significance of individual reflections, not the shell averages. Also the average will depend on the width of the resolution bin, so one will get the strange effect that the apparent resolution will depend on how one bins the data! The assumption being made in taking the bin average is that I/sigma(I) falls off smoothly with d* but that's unlikely to be the reality. It seems to me that a chi-square statistic which takes into account the actual distribution of I/sigma(I) would be a better bet than the bin average, though it's not entirely clear how o
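Since the thread keeps returning to how these metrics behave, here is a minimal sketch, on wholly simulated data, of how Rmerge, Rmeas, Rpim, and a half-dataset CC(1/2) are computed. The exponential intensity distribution, the counting-noise model and the multiplicities are invented for illustration, not taken from any real experiment; the point it demonstrates is Ian's: Rmerge is not invariant to redundancy, while Rpim improves with it.

```python
import numpy as np

rng = np.random.default_rng(0)

def merging_stats(groups):
    """Rmerge, Rmeas, Rpim and CC(1/2) from redundant observations.

    groups: list of 1-D arrays, one per unique hkl, each holding that
    reflection's redundant intensity measurements."""
    num_merge = num_meas = num_pim = denom = 0.0
    half1, half2 = [], []
    for I in groups:
        n = len(I)
        if n < 2:
            continue
        dev = np.abs(I - I.mean()).sum()
        num_merge += dev                                 # Rmerge numerator
        num_meas += np.sqrt(n / (n - 1.0)) * dev         # redundancy-corrected
        num_pim += np.sqrt(1.0 / (n - 1.0)) * dev        # precision-indicating
        denom += I.sum()
        # CC(1/2): random split into two half-datasets, mean each half
        perm = rng.permutation(n)
        half1.append(I[perm[: n // 2]].mean())
        half2.append(I[perm[n // 2 :]].mean())
    cc_half = np.corrcoef(half1, half2)[0, 1]
    return num_merge / denom, num_meas / denom, num_pim / denom, cc_half

def simulate(multiplicity, n_refl=2000):
    """Toy data: exponential true intensities plus counting-style noise."""
    groups = []
    for _ in range(n_refl):
        I_true = rng.exponential(1000.0)
        groups.append(I_true + rng.normal(0.0, np.sqrt(I_true), size=multiplicity))
    return groups

r_merge4, r_meas4, r_pim4, cc4 = merging_stats(simulate(4))
```

With purely random (counting) errors the simulation shows Rmerge rising and Rpim falling as the multiplicity grows, which is exactly why Rmerge alone is a poor guide to data quality.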
Re: [ccp4bb] MAD
For the history buffs and crystallographers needing some R&R and chill-out, an interesting historical fiction read about the era of Newton and Leibniz and the foundation of the Royal Society is the Baroque Cycle by Neal Stephenson. http://en.wikipedia.org/wiki/The_Baroque_Cycle Cryptonomicon, although written before, picks up a descendant of a character from the Cycle, and can be considered imho the 4th book http://en.wikipedia.org/wiki/Cryptonomicon All together ~ 2400 pages. Cheap on Amazon 3rd party. Book a long vacation. Best, BR -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Ian Tickle Sent: Sunday, January 29, 2012 5:23 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] MAD
Re: [ccp4bb] protein lost on membrane of centricon!!
Rashmi, I had a similar problem when I used an Amicon to concentrate my protein. Your buffer composition suggests that the high salt concentration and the glycerol are what keep your protein soluble; I had a similar buffer with a lot of salt and 15% glycerol. I reasoned that my protein really was soluble but that temperature was also critical, so I put the centrifuge inside a refrigerator and centrifuged at 10°C or less, and my protein survived the possible thermal shock. Sometimes something simple works fine, and it did for me. Hope this helps. Aaron Hernandez Universidad Nacional Autónoma de México (UNAM) From: "Bosch, Juergen" To: CCP4BB@JISCMAIL.AC.UK Sent: Sunday, 29 January 2012 11:08:09 Subject: Re: [ccp4bb] protein lost on membrane of centricon!!
Re: [ccp4bb] Reasoning for Rmeas or Rpim as Cutoff
Just one thing to add to that very detailed response from Ian. We've tended to use a slightly different approach to determining a sensible resolution cutoff, where we judge whether there's useful information in the highest resolution data by whether it agrees with calculated structure factors computed from a model that hasn't been refined against those data. We first did this with the complex of the Shiga-like toxin B-subunit pentamer with the Gb3 trisaccharide (Ling et al, 1998). From memory, the point where the average I/sig(I) drops below 2 was around 3.3A. However, we had a good molecular replacement model to solve this structure and, after just carrying out rigid-body refinement, we computed a SigmaA plot using data to the edge of the detector (somewhere around 2.7A, again from memory). The SigmaA plot dropped off smoothly to 2.8A resolution, with values well above zero (indicating significantly better than random agreement), then dropped suddenly. So we chose 2.8A as the cutoff. Because there were four pentamers in the asymmetric unit, we could then use 20-fold NCS averaging, which gave a fantastic map. In this case, the averaging certainly helped to pull out something very useful from a very weak signal, because the maps weren't nearly as clear at lower resolution. Since then, a number of other people have applied similar tests. Notably, Axel Brunger has done some careful analysis to show that it can indeed be useful to take data beyond the conventional limits. When you don't have a great MR model, you can do something similar by limiting the resolution for the initial refinement and rebuilding, then assessing whether there's useful information at higher resolution by using the improved model (which hasn't seen the higher resolution data) to compute Fcalcs. By the way, it's not necessary to use a SigmaA plot -- the correlation between Fo and Fc probably works just as well. 
Note that, when the model has been refined against the lower resolution data, you'll expect a drop in correlation at the resolution cutoff you used for refinement, unless you only use the cross-validation data for the resolution range used in refinement. - Randy J. Read Department of Haematology, University of Cambridge Cambridge Institute for Medical Research Tel: +44 1223 336500 Wellcome Trust/MRC Building Fax: +44 1223 336827 Hills Road E-mail: rj...@cam.ac.uk Cambridge CB2 0XY, U.K. www-structmed.cimr.cam.ac.uk On 29 Jan 2012, at 17:25, Ian Tickle wrote:
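Randy's model-based test — judging whether the outer shells carry signal by correlating Fo with Fc from a model that has never seen those data — is easy to prototype. Below is a hypothetical sketch on simulated structure factors; the shell count, the noise model, and the d* = 0.35 'signal limit' (about 2.9 Å) are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def shell_cc(d_star, f_obs, f_calc, n_shells=10):
    """Fo/Fc correlation in equal-population resolution shells (d* = 1/d, 1/Angstrom)."""
    order = np.argsort(d_star)
    shells = np.array_split(order, n_shells)
    return [(d_star[idx].mean(), np.corrcoef(f_obs[idx], f_calc[idx])[0, 1])
            for idx in shells]

# Simulated example: the model agrees with the "observations" out to
# d* = 0.35, beyond which the observations are pure noise.
n = 5000
d_star = rng.uniform(0.1, 0.5, n)
f_calc = rng.exponential(100.0, n)
noise = rng.exponential(100.0, n)
f_obs = np.where(d_star < 0.35, f_calc + rng.normal(0, 20.0, n), noise)

shells = shell_cc(d_star, f_obs, f_calc)
# The correlation stays high in the low-resolution shells and collapses
# toward zero past the point where the signal runs out.
```

The break in the shell-by-shell correlation, rather than a fixed mean(I/sigma) threshold, then suggests where a defensible resolution cutoff lies.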
[ccp4bb] REMINDER - Call for Beamtime 2012 at EMBL Hamburg
Call for access to Synchrotron Beamline Facilities 2012 * DEADLINE 31/01/2012 24:00 CET * EMBL Hamburg, Germany We announce a call for synchrotron beam time applications in biological small-angle scattering (SAXS) and macromolecular crystallography (MX). Beam time will be available at the DORIS and PETRA storage rings from March 2012 to February 2013. On the DORIS storage ring, EMBL Hamburg will operate beamlines in SAXS (responsible scientist Dmitri Svergun) and MX (responsible scientist Victor Lamzin). On the PETRA storage ring, we will operate one beamline for biological SAXS (responsible scientist Dmitri Svergun) and one beamline for MX (responsible scientist Thomas Schneider). Electronic beam proposal forms and a detailed description of the beamlines are available at http://www.embl-hamburg.de/ (click on 'Access to Infrastructures'). The deadline for submission of proposals is 31 January 2012. An external Project Evaluation Committee will assess the proposals. Access to the EMBL Hamburg facilities will in part be supported by the European Commission, Research Infrastructure Action under the FP7 project BioStruct-X (http://www.biostruct-x.eu/). A new visitor program for high-throughput crystallization, sample preparation, characterization and SAXS is now supported by P-CUBE (http://www.p-cube.eu/) for external users. For further information tel. +49 40-89902-111, s...@embl-hamburg.de (SAXS) b...@embl-hamburg.de (MX).
Re: [ccp4bb] Reasoning for Rmeas or Rpim as Cutoff
Hi all, In the context of the ongoing discussion, can anybody post links to a few relevant articles? Thanks in advance, ARKO On Mon, Jan 30, 2012 at 3:05 AM, Randy Read wrote: