[ccp4bb] Overrefinement considerations and Refmac5.

2020-03-06 Thread M T
Dear BBers,

I am trying to refine a structure using COOT and Refmac5 and I have some
concerns about overrefinement and the X-ray term weight in Refmac5, based on
the fact that letting the R factor drift too far from Rfree during
refinement is not good...

So... First question: what is too far? I have some values in mind, like a
6% difference is OK and 10% is not... But is there a relation between the
resolution of the structure and this difference? Should it be higher at
lower resolution, or always around 6-7% independently of the resolution?

Second question: OK, I have too big a difference, let's say 9-10%... What
could be the reason for that, and what can I adjust to reduce this
difference?

One approach I chose is to look at the X-ray term weight (even though I am
totally sure that Refmac5 is doing things better than me), because I saw
that the final rms on BondLength was too constrained (I have in mind that
this value should stay between 0.01 and 0.02).
So I looked into the Refmac log to find the starting weight and I
found 8.75.
Then I ran several tests and here are the results:
Refinement settings                          R factor  Rfree   BondLength  BondAngle  ChirVolume
Auto weighting + experimental sigmas         0.1932    0.2886  0.0072       1.6426    0.1184
Weighting term at 4 + experimental sigmas    0.1780    0.3159  0.1047       8.1929    0.5937
Weighting term at 4                          0.1792    0.3143  0.1008       7.8200    0.5667
Weighting term at 15 + experimental sigmas   0.1783    0.3272  0.2020       1.6569    0.9745
Weighting term at 15                         0.1801    0.3279  0.2022      12.5748    0.9792
Weighting term at 8.75                       0.1790    0.3235  0.1545      10.5118    0.7909
Auto weighting                               0.1948    0.2880  0.0076       1.6308    0.1176
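
To make the comparison easier to read, here is a minimal Python sketch that computes the Rfree - R gap for each run and ranks the runs by Rfree. The numbers are copied from the results above; the shortened run labels and the helper name `r_gap` are only illustrative and do not come from Refmac5.

```python
# Values copied from the results above:
# (settings, R factor, Rfree, rms BondLength).
runs = [
    ("Auto weighting + experimental sigmas", 0.1932, 0.2886, 0.0072),
    ("Weight 4 + experimental sigmas", 0.1780, 0.3159, 0.1047),
    ("Weight 4", 0.1792, 0.3143, 0.1008),
    ("Weight 15 + experimental sigmas", 0.1783, 0.3272, 0.2020),
    ("Weight 15", 0.1801, 0.3279, 0.2022),
    ("Weight 8.75", 0.1790, 0.3235, 0.1545),
    ("Auto weighting", 0.1948, 0.2880, 0.0076),
]


def r_gap(r_work, r_free):
    """Return Rfree - R in percentage points."""
    return 100.0 * (r_free - r_work)


# Rank by Rfree rather than by R: R can always be pushed down by
# loosening the geometry, Rfree cannot.
ranked = sorted(runs, key=lambda run: run[2])

for label, r_work, r_free, rms_bond in ranked:
    print(f"{label:38s} gap = {r_gap(r_work, r_free):5.2f}  rmsBond = {rms_bond:.4f}")
```

In this ranking the two auto-weighted runs come first: they have the lowest Rfree, the smallest gap and the tightest bond-length rms, while every manual weight lowers R only at the cost of a higher Rfree.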



[Image: Refinement Parameters]

Since nothing looks satisfactory, I decided to ask my questions here...

What do you recommend to fix my problem, which is too large a difference
between R and Rfree?

Thank you for your answers.



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1


Re: [ccp4bb] Overrefinement considerations and Refmac5.

2020-03-06 Thread Eleanor Dodson
You don't give the data resolution - that is very important.

I hate this rule-bound approach to Rfree - R differences, but as a guide,
assuming no non-crystallographic symmetry:
1A data - you would expect them to be almost equal
3A data - I would expect a difference of at least 10% - I once had the
pleasure of sending a paper back as unreasonable when the data was only to
3A and the R factors differed by 3%

Pseudo-symmetry and non-crystallographic symmetry can make things more
complicated.
Ideally reflections related by such symmetry should match - both should
belong either to the Rfree set or to the working set.

The best way to get a low Rfactor and a high Rfree is to have virtually no
geometric restraints, i.e. to have a very high weighting term.
The auto selection works pretty well in my experience.

Not really an answer but some ideas.
Eleanor



Re: [ccp4bb] Overrefinement considerations and Refmac5.

2020-03-06 Thread Alexandre Ourjoumtsev
Dear Eleanor, dear Michel (?), 

A while ago we statistically analysed the distribution of R, Rfree and of their
difference as a function of the logarithm of resolution (Urzhumtsev et al.,
2009, Acta Cryst, D65, 1283-1291).
On this log scale, the mode values for all three are nearly linear functions
(see Figs. 4 and 5 and equations (11) and (12)).

At 3A resolution, a 3% difference is indeed too small, but 10% does not seem
to be common either; from pure statistics, the mode is closer to 5%. At 1A,
the mode of the Rfree-R distribution is near 2%.
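
As a rough illustration of what "nearly linear in the logarithm of resolution" means, the Python sketch below interpolates between the two mode values quoted in this message (about 2% at 1A and about 5% at 3A). This is only an assumption-laden sketch: the anchor values are the approximate numbers given here, not the fitted coefficients of equations (11) and (12) in the paper, and the name `typical_gap` is made up for the example.

```python
import math

# Two approximate mode values of (Rfree - R) quoted in this message.
# They are NOT the fitted coefficients from the 2009 paper.
ANCHORS = {1.0: 2.0, 3.0: 5.0}  # resolution in Angstrom -> gap in percent


def typical_gap(resolution):
    """Interpolate the quoted gap linearly in log(resolution)."""
    (d1, g1), (d2, g2) = sorted(ANCHORS.items())
    slope = (g2 - g1) / (math.log(d2) - math.log(d1))
    return g1 + slope * (math.log(resolution) - math.log(d1))


for d in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"{d:.1f} A: about {typical_gap(d):.1f} %")
```

The distributions are broad, so such a number is a centre of gravity rather than a threshold.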

With best wishes, 

Sacha Urzhumtsev 



Re: [ccp4bb] Overrefinement considerations and Refmac5.

2020-03-06 Thread 00000c2488af9525-dmarc-request
Well, as no one else has come back to you, I wouldn't say a 10% difference
between R and R-free is 'bad' and it's certainly not super-bad! It's a bit
high, but if you look at the papers by Ian Tickle from the late 90's and
others I think you can be reasonably happy with it. I've been told that
A N Other well-known protein crystallography suite gives a lower difference
for the same structure, so you could always try that ;-)

Jon Cooper



Re: [ccp4bb] [3dem] Which resolution?

2020-03-06 Thread James Holton

Thank you Kay,

Very good points, as always.  I was thinking there must be a better 
apodization filter than cutoffs and B factors.  I'll have to try a 
CC1/2-based roll-off.  But, I wonder if this could be done better on a 
per-reflection basis?  Taking advantage of the sigmas?  I have tried 
using 1/sigma^2 as a weight in map calculation, and that makes the map 
look really weird.  Your idea makes more sense.


On the other hand, the French-Wilson (F&W) truncation procedure is 
supposed to come up with the maximum-likelihood Fourier coefficient 
given the observed intensity and sigma(intensity).  So, the "F" values 
we get from truncate or xdsconv should already be "good"?  Maybe the 
problem is that we are sharpening after the F&W step, rather than 
before?  Or maybe the problem is that F&W bottoms out around F=sig(F).  
Your proposed weight might finish the job...


As for sigmaA, one thing I do know is that refmac5 now uses experimental 
sigmas by default.  Phenix.refine does not.


One thing is for sure, sharpening the data before refinement is not 
going to become a popular strategy.  This is because pre-sharpened data 
will make Rwork and Rfree higher than they are with the "natural" B 
factor.  You can also make your Rwork and Rfree much lower by applying a 
positive B factor to your data before starting refinement.  This comes 
with absolutely no improvement in model quality, so please don't try 
this at home.
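
A toy Python sketch of that last point, under loudly stated assumptions: the two reflections and their amplitudes are invented, the high-resolution one is made to fit worse, and the model is assumed to absorb the applied B factor exactly, so the same scale exp(-B/(4 d^2)) is applied to Fobs and Fcalc. The function names are illustrative only.

```python
import math


def apply_b(f, d, b):
    """Scale an amplitude at resolution d (Angstrom) by exp(-B / (4 d^2))."""
    return f * math.exp(-b / (4.0 * d * d))


def r_factor(f_obs, f_calc):
    """Conventional R = sum|Fobs - Fcalc| / sum(Fobs)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


# Hypothetical toy data: the high-resolution reflection fits worse.
d_spacing = [4.0, 2.0]
f_obs = [100.0, 50.0]
f_calc = [95.0, 40.0]


def r_after_b(b):
    """R factor after applying the same B to data and (ideal) model."""
    obs = [apply_b(f, d, b) for f, d in zip(f_obs, d_spacing)]
    calc = [apply_b(f, d, b) for f, d in zip(f_calc, d_spacing)]
    return r_factor(obs, calc)


print("natural  :", r_factor(f_obs, f_calc))
print("blurred  :", r_after_b(20.0))   # positive B, data smoothed
print("sharpened:", r_after_b(-20.0))  # negative B, data sharpened
```

The relative misfit of each reflection is unchanged in all three cases; only the weight given to the poorly fitting high-resolution term moves, which is why R shifts without the model getting any better or worse.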


-James Holton
MAD Scientist

On 3/5/2020 9:38 PM, Kay Diederichs wrote:

Dear James,

important and educational points! This triggers some thoughts ...

The one point where I don't quite agree is with "What about filtering out the
noise? An ideal noise suppression filter has the same shape as the signal (I
found that in Numerical Recipes), and the shape of the signal from a
macromolecule is a Gaussian in reciprocal space (aka straight line on a Wilson
plot). This is true, by the way, for both a molecule packed into a crystal or
free in solution. So, the ideal noise-suppression filter is simply applying a
B factor."

I think we can do better than that. We should use the knowledge about the
actual signal and its noise (which we measure) as a weighting factor, rather
than that of the theoretical signal (the straight line in the Wilson plot),
for the purpose of noise suppression. Formula 13.3.6 in Numerical Recipes
(3rd ed., 2007), which gives the optimal (Wiener) filter to be used for
weighting, is phi(f) = S(f)^2/(S(f)^2 + N(f)^2). But at high resolution, this
is just CC1/2 (see Box 1 of reference (1): the formula
CC1/2 = 1/(1 + 2/(I/sigma)^2) can be written as
CC1/2 = <I>^2/(<I>^2 + 2 <sigma>^2), where sigma is the estimate of the noise
in I; I don't know right now why there is a factor of 2). This goes to zero at
the resolution where the signal goes to zero, and is near one in the
resolution range in which we have good knowledge of the signal.
(I only thought about this today, and I also considered CC* as a weighting 
factor, as I understand is suggested by Rosenthal and Henderson, J.Mol.Biol. 
2003, but I cannot convince myself currently that this is right. Anyway, the 
shape of the CC* curve as a function of resolution matches that of CC1/2)
In other words, we should be able to suppress the noise by multiplying the 
Fourier coefficients used for map calculation with (a smooth 
resolution-dependent approximation of) CC1/2. This should allow us to sharpen 
with the best noise suppression we can get.
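
A minimal Python sketch of this idea, with assumptions stated up front: the shell values of I/sigma and the amplitudes are invented, the per-shell CC1/2 is used directly instead of a smooth resolution-dependent fit, and the function names are only illustrative. It also checks numerically that the CC1/2 expression is the Wiener filter with S = I and N^2 = 2 sigma^2.

```python
import math


def wiener_weight(signal, noise):
    """Wiener filter phi = S^2 / (S^2 + N^2)."""
    return signal**2 / (signal**2 + noise**2)


def cc_half(i_over_sigma):
    """CC1/2 = 1 / (1 + 2 / (I/sigma)^2), the high-resolution formula quoted above."""
    return 1.0 / (1.0 + 2.0 / i_over_sigma**2)


def weight_coefficients(amplitudes, i_over_sigma_per_shell):
    """Down-weight each map coefficient by the CC1/2 of its resolution shell."""
    return [f * cc_half(x) for f, x in zip(amplitudes, i_over_sigma_per_shell)]


# Hypothetical shells, ordered from low to high resolution (I/sigma must be > 0).
i_over_sigma = [20.0, 5.0, 1.0, 0.2]
amplitudes = [100.0, 60.0, 30.0, 10.0]

for x, f, fw in zip(i_over_sigma, amplitudes,
                    weight_coefficients(amplitudes, i_over_sigma)):
    # With I = x and sigma = 1, the Wiener noise term is N = sqrt(2) * sigma.
    same = math.isclose(cc_half(x), wiener_weight(x, math.sqrt(2.0)))
    print(f"I/sigma = {x:5.1f}  weight = {cc_half(x):.3f}  F = {f:6.1f} -> {fw:6.2f}  Wiener match: {same}")
```

The weight stays near one where I/sigma is large and falls towards zero as the signal disappears, which is the roll-off behaviour described above.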

Thinking about this, we are already typically using weighted Fourier 
coefficients of the form 2mFobs-DFcalc for map calculation. Aren't these 
already weighted in the correct way? I think not - those m and D weights are 
calculated from estimates of model (in-)accuracy and (in-)completeness, but 
don't properly take the measurement errors into account. Of course, since noisy 
data make the sigmaA values worse, the noise in the data influences sigmaA, but 
not in the functionally correct form. To my understanding, the correct way to 
take account of both model and data errors is given by reference (2), which - 
to my knowledge - is not yet implemented except in PHASER.

Hope this makes sense!

Kay

References:
(1) Karplus & Diederichs (2015) Assessing and maximizing data quality in 
macromolecular crystallography.
Curr. Opin. Struct. Biol. 34, 60-68 . PDF at 
https://www.biologie.uni-konstanz.de/typo3temp/secure_downloads/82815/0/2b10c9e6f9a28129e1b119d21aeeab217c918bb1/Karplus2015_CurrOpinStructBiol.pdf
(2) RJ Read, AJ McCoy (2016) A log-likelihood-gain intensity target for 
crystallographic phasing that accounts for experimental error. Acta 
Crystallographica Section D: Structural Biology 72 (3), 375-387
https://scripts.iucr.org/cgi-bin/paper?dz5382

On Thu, 5 Mar 2020 01:11:33 +0100, James Holton  wrote:


The funny thing is, although we generally regard resolution as a primary
indicator of data quality the appearance of a density map at the classic
"1-sigma" contour has very little to do

Re: [ccp4bb] Overrefinement considerations and Refmac5.

2020-03-06 Thread Andrew Leslie
I would like to add a small caveat to Eleanor’s rule about a 3% difference 
being too low for a structure refined against 3Å data.

If the 3Å data is for a structure that has already been solved at much higher 
resolution (e.g. 2Å), the only difference for the 3Å dataset is a different 
ligand (say), and the structure is solved by molecular replacement using the 
high resolution structure as a model, then it is possible (and quite 
acceptable) to have a much lower difference between Rwork and Rfree 
than one might expect, even as low as 3-4%.


Andrew




Re: [ccp4bb] [3dem] Which resolution?

2020-03-06 Thread Pavel Afonine
Randy Read's paper in latest Acta D:

Measuring and using information gained by observing diffraction data
http://journals.iucr.org/d/issues/2020/03/00/ba5308/index.html

seems very relevant to this discussion!

Pavel



Re: [ccp4bb] Overrefinement considerations and Refmac5.

2020-03-06 Thread Bernhard Rupp
In addition to Sacha’s work there are a few older papers by Ian Tickle & Cie. 
(summarized in BMC) and a figure of the empirical distributions

http://www.ruppweb.org/Garland/gallery/Ch12/pages/Biomolecular_Crystallography_Fig_12-24.htm

which are broad and – as mentioned - the actual delta values are context 
dependent.

 

Best, BR

 


Re: [ccp4bb] Overrefinement considerations and Refmac5.

2020-03-06 Thread Alexandre Ourjoumtsev
Thank you, Bernhard, for pointing out this nice paper by Ian, one more that I
missed previously.

Best regards, 

Sacha Urzhumtsev 

- On 6 Mar 20, at 19:15, Bernhard Rupp wrote:

> In addition to Sacha’s work there are a few older papers by Ian Tickle & Cie.
> (summarized in BMC) and a figure of the empirical distributions

> http://www.ruppweb.org/Garland/gallery/Ch12/pages/Biomolecular_Crystallography_Fig_12-24.htm

> which are broad and – as mentioned - the actual delta values are context
> dependent.

> Best, BR

> From: CCP4 bulletin board  On Behalf Of Andrew Leslie
> Sent: Friday, March 6, 2020 08:52
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Overrefinement considerations and Refmac5.

> I would like to add a small caveat to Eleanor’s rule about a 3% difference
> being too low for a structure refined against 3Å data.

> If the 3Å data are for a structure that has already been solved at much higher
> resolution (e.g. 2Å), the only difference for the 3Å dataset is a different
> ligand (say), and the structure is solved by molecular replacement using the
> high resolution structure as a model, then it is possible (and quite
> acceptable) to have a much lower difference between Rwork and Rfree than one
> might expect, even as low as 3-4%.

> Andrew

>> On 6 Mar 2020, at 15:22, Eleanor Dodson
>> <176a9d5ebad7-dmarc-requ...@jiscmail.ac.uk> wrote:

>> You don't give the data resolution - that is very important.

>> I hate this rule-bound approach to Rfree - R differences, but as a guide:

>> No non-crystallographic symmetry.

>> 1A data - you would expect them to be almost equal

>> 3A data - I would expect a difference of at least 10% - I once had the
>> pleasure of sending a paper back as unreasonable when the data were only to
>> 3A and the R factors differed by 3%

>> Pseudo-symmetry and non-crystallographic symmetry can make things more
>> complicated.

>> Ideally reflections related by such symmetry should match - either both
>> belong to the Rfree set, or both to the working set.

>> The best way to get a low Rfactor and a high Rfree is to have virtually no
>> geometric restraints, i.e. have a very high weighting term.

>> The auto selection works pretty well in my experience.

>> Not really an answer but some ideas..

>> Eleanor

>> On Fri, 6 Mar 2020 at 13:32, M T <michel...@gmail.com> wrote:

>>> Dear BBers,

>>> I am trying to refine a structure using COOT and Refmac5 and I have some
>>> concerns about overrefinement and the X-ray term weight in Refmac5, based on
>>> the fact that letting the R factor drift too far from Rfree during
>>> refinement is not good...

>>> So... First question about that: what is too far? I have some values in
>>> mind, like 6% of difference is OK, 10% is not... But is there a relation
>>> between the resolution of the structure and this difference? Should it be
>>> higher at lower resolution, or always around 6-7% independently of the
>>> resolution?

>>> Second question: say the difference is too large, 9-10%... What could be
>>> the reason for that, and what can I adjust to reduce it?

>>> One approach I chose was to look at the X-ray term weight (even though I am
>>> quite sure that Refmac5 does this better than I can), because I saw that the
>>> final rms on BondLength was too tightly restrained (I have in mind that this
>>> value should stay between 0.01 and 0.02).

>>> So I looked into the Refmac log to find the starting point, and I found
>>> 8.75.

>>> Then I ran several tests, and here are the results:

>>> Settings                                                  R factor  Rfree   BondLength  BondAngle  ChirVolume
>>> Auto weighting and experimental sigmas boxes checked      0.1932    0.2886  0.0072      1.6426     0.1184
>>> Weighting term at 4 and experimental sigmas box checked   0.1780    0.3159  0.1047      8.1929     0.5937
>>> Weighting term at 4                                       0.1792    0.3143  0.1008      7.8200     0.5667
>>> Weighting term at 15 and experimental sigmas box checked  0.1783    0.3272  0.2020      1.6569     0.9745
>>> Weighting term at 15                                      0.1801    0.3279  0.2022      12.5748    0.9792
>>> Weighting term at 8.75                                    0.1790    0.3235  0.1545      10.5118    0.7909
>>> Auto weighting box checked                                0.1948    0.2880  0.0076      1.6308     0.1176


>>> Refinement Parameters

>>> 

>>> Since none of these results looks satisfactory, I decided to ask my
>>> questions here...

>>> What do you recommend to fix my problem, which is too large a difference
>>> between R and Rfree?

>>> Thank you for your answers.


[ccp4bb] Running CCP4_Blend on Mac

2020-03-06 Thread Reddiravikumar Kumar
Hi all,

I am trying to run the Blend program from the CCP4 interface (7.0.078) to
merge multiple datasets (9). I am running CCP4 on macOS Mojave 10.14.6 and
have installed R in /usr/local/bin/R. In the CCP4 interface, the same
path is added under system administration/configure interface/external
programs/add a program/. When I start Blend, it shows the message
"Dependency error: BLEND requires R". Am I missing something here?
Please advise me on running Blend through the CCP4 interface.


Thank you,
-- 
ravi kumar





Re: [ccp4bb] Running CCP4_Blend on Mac

2020-03-06 Thread David Waterman
Dear Ravi Kumar

If I remember correctly, Blend requires the path to the executable Rscript
rather than R. I can't check this now but I can look into it on Monday if
you are still having trouble.
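
To find out where Rscript lives before entering a path in the interface, a small Python check like the one below may help. It is only a sketch: the helper name `find_r_executables` is made up for this example, and it simply reports whatever R and Rscript executables are visible on the PATH of the shell it is run from, which may differ from what the CCP4 interface sees.

```python
import shutil


def find_r_executables():
    """Return the full paths of R and Rscript found on PATH (None if absent)."""
    return {name: shutil.which(name) for name in ("R", "Rscript")}


found = find_r_executables()
for name, path in found.items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If Rscript is reported, that full path is the one to try in the external programs setting instead of the path to R.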

Best wishes
David

On Fri, 6 Mar 2020, 21:02 Reddiravikumar Kumar wrote:

> Hi all,
>
> I am trying to run the Blend program from the CCP4 interface (7.0.078) to
> merge multiple datasets (9). I am running CCP4 on macOS Mojave 10.14.6 and
> have installed R in /usr/local/bin/R. In the CCP4 interface, the same
> path is added under system administration/configure interface/external
> programs/add a program/. When I start Blend, it shows the message
> "Dependency error: BLEND requires R". Am I missing something here?
> Please advise me on running Blend through the CCP4 interface.
>
>
> Thank you,
> --
> ravi kumar
>


