Re: [ccp4bb] Large number of outliers in the dataset

Robbie Joosten Wed, 29 Mar 2017 09:44:11 -0700

I agree that you should try to use all the data. There is nothing wrong with 
solving your structure with the data you trust and then extending the 
resolution when your model is in an advanced state of refinement. If you worry 
whether your data has added value, you can use paired refinement to find a 
decent cut-off. The procedure is a bit of work, but you can use the PDB-REDO 
server (pdb-redo.eu) if you want a ready-made solution.
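
For readers who have not met paired refinement before, here is a minimal sketch of the decision logic only. It assumes you have already refined the same starting model at a coarser and at a finer cut-off, and then evaluated Rfree for both models against the same reflections at the coarser cut-off. The names `PairedStep`, `extension_helps` and `choose_cutoff` are hypothetical, the code runs no refinement itself, and it is not how the PDB-REDO server is implemented.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PairedStep:
    """One paired-refinement comparison (all names here are illustrative)."""
    d_coarse: float            # reference resolution cut-off, in Angstrom
    d_fine: float              # extended cut-off being tested, in Angstrom
    rfree_coarse_model: float  # Rfree of the model refined to d_coarse
    rfree_fine_model: float    # Rfree of the model refined to d_fine
    # Both Rfree values must be computed against the SAME reflections,
    # i.e. the data truncated at d_coarse; otherwise they are not comparable.


def extension_helps(step: PairedStep, tolerance: float = 0.0) -> bool:
    """True if the model refined with the extra data is no worse at d_coarse."""
    return step.rfree_fine_model <= step.rfree_coarse_model + tolerance


def choose_cutoff(steps: Sequence[PairedStep]) -> float:
    """Walk from coarse to fine and stop at the first step that does not help."""
    if not steps:
        raise ValueError("need at least one paired comparison")
    cutoff = steps[0].d_coarse
    for step in steps:
        if not extension_helps(step):
            break
        cutoff = step.d_fine
    return cutoff
```

For the dataset in this thread the steps could be 2.6 to 2.5, 2.5 to 2.4 and 2.4 to 2.3 Å; the Rfree values have to come from your own refinements.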


Cheers,
Robbie


From: Nicolas FOOS<mailto:[email protected]>
Sent: Wednesday, 29 March 2017 18:19
To: [email protected]<mailto:[email protected]>
Subject: Re: [ccp4bb] Large number of outliers in the dataset


Dear Juliana,

All the statistics presented here look good in terms of the resolution cut 
(I might even be less severe). For me the point is the mosaicity: you report 
1.90, which is high in my opinion. How do your images look? I am wondering 
whether the indexing is really right. Maybe Xtriage's complaint about outliers 
is due to this high mosaicity. What is Xtriage's diagnosis in terms of possible 
twinning? I am also wondering about pseudo-translation.
Maybe try reprocessing your data with this in mind.

Hope this helps.

Nicolas

Nicolas Foos
PhD
Structural Biology Group
European Synchrotron Radiation Facility (E.S.R.F)
71, avenue des Martyrs
CS 40220
38043 GRENOBLE Cedex 9
+33 (0)6 76 88 14 87
+33 (0)4 76 88 45 19


On 29/03/2017 17:56, Mark J van Raaij wrote:
To be really convinced I think you should also compare the maps at 2.6 and 2.3 
Å. If the 2.3 Å map looks better, go for it. If it doesn’t look better, perhaps 
you are adding noise, but the I/sigma and CC1/2 values suggest you aren’t.
Perhaps try 2.5 and 2.4 Å also.
And perhaps remove a well-ordered aa from the input model, refine at different 
resolutions and compare the difference maps for that aa. Or calculate omit maps 
at different resolutions and compare those.

Mark J van Raaij
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
calle Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://wwwuser.cnb.csic.es/~mjvanraaij

On 29 Mar 2017, at 17:44, Phil Evans 
<[email protected]<mailto:[email protected]>> wrote:

It is not clear to me why you believe that cutting the resolution of the data 
would improve your model (which after all is the aim of refinement). At the 
edge CC(1/2) and I/sigI are perfectly respectable, and there doesn’t seem to be 
anything wrong with the Wilson plot. The R-factor will of course be higher if 
you include more weak data, but minimising R is _not_ the aim of refinement. 
You should keep all the data.

I don’t know what xtriage means by “large number of outliers”: perhaps someone 
else can explain.

Phil



On 29 Mar 2017, at 14:54, Juliana Ferreira de Oliveira 
<[email protected]<mailto:[email protected]>> wrote:

Hello,
I have one dataset at 2.3 Å (it can probably go higher: in the outer shell 
I/σ = 2.1 and CC1/2 = 0.779; the summary data are below), but when I run the 
Xtriage analysis it says that “There are a large number of outliers in the 
data”. The space group is P212121. When I refine the MR solution, Rfree stalls 
at around 30% and doesn’t decrease (in fact, if I continue refining it starts 
to increase).
The Wilson plot does not fit very well between 2.3 and 2.6 Å:

<image001.jpg>

So I decided to cut the data at 2.6 Å, and Xtriage no longer reports 
outliers. I could refine the MR solution very well: the final Rwork is 
0.2427 and Rfree = 0.2730, and validation in Phenix indicates a good structure.
I ran Zanuda to confirm the space group, and it says that the space group 
assignment seems to be correct.
Do you think that I can improve my structure and solve it at 2.3 Å or better? 
Or can I finish it at 2.6 Å? To publish at 2.6 Å I need to justify the 
resolution cut, right? What should I say?
Thank you for your help!
Regards,
Juliana

Summary data:
                                      Overall  InnerShell  OuterShell
Low resolution limit                    51.51       51.51        2.42
High resolution limit                    2.30        7.27        2.30
Rmerge                                  0.147       0.054       0.487
Rmerge in top intensity bin             0.080           -           -
Rmeas (within I+/I-)                    0.155       0.057       0.516
Rmeas (all I+ & I-)                     0.155       0.057       0.516
Rpim (within I+/I-)                     0.048       0.017       0.164
Rpim (all I+ & I-)                      0.048       0.017       0.164
Fractional partial bias                -0.006      -0.003       0.146
Total number of observations            83988        2907       11885
Total number unique                      8145         307        1167
Mean((I)/sd(I))                           9.3        23.9         2.1
Mn(I) half-set correlation CC(1/2)      0.991       0.998       0.779
Completeness                             99.9        99.5       100.0
Multiplicity                             10.3         9.5        10.2
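
As a quick sanity check on the summary above, the sketch below tests whether the merging statistics are internally consistent. It relies on a simplifying assumption: every reflection is treated as measured n times, with n set to the shell's mean multiplicity, which gives Rmeas ≈ Rmerge·sqrt(n/(n-1)) and Rpim ≈ Rmerge·sqrt(1/(n-1)). The real definitions weight each reflection by its own multiplicity, so only approximate agreement should be expected. The function names are illustrative and do not come from any processing package.

```python
import math

# Values copied from the summary table:
# (Rmerge, Rmeas, Rpim, multiplicity, total observations, unique reflections)
SHELLS = {
    "overall": (0.147, 0.155, 0.048, 10.3, 83988, 8145),
    "inner":   (0.054, 0.057, 0.017, 9.5, 2907, 307),
    "outer":   (0.487, 0.516, 0.164, 10.2, 11885, 1167),
}


def approx_rmeas(rmerge: float, n: float) -> float:
    """Estimate Rmeas from Rmerge, assuming uniform multiplicity n."""
    return rmerge * math.sqrt(n / (n - 1.0))


def approx_rpim(rmerge: float, n: float) -> float:
    """Estimate Rpim from Rmerge, assuming uniform multiplicity n."""
    return rmerge * math.sqrt(1.0 / (n - 1.0))


def check(shells=SHELLS):
    """Return estimated and reported statistics for each shell."""
    rows = {}
    for name, (rmerge, rmeas, rpim, mult, n_obs, n_unique) in shells.items():
        rows[name] = {
            "rmeas_est": approx_rmeas(rmerge, mult),
            "rmeas_reported": rmeas,
            "rpim_est": approx_rpim(rmerge, mult),
            "rpim_reported": rpim,
            # Mean multiplicity should equal observations / unique reflections.
            "multiplicity_from_counts": n_obs / n_unique,
            "multiplicity_reported": mult,
        }
    return rows


if __name__ == "__main__":
    for name, row in check().items():
        print(name, {key: round(value, 3) for key, value in row.items()})
```

Under this approximation the estimated Rmeas and Rpim agree with the reported values to within about 0.005 in every shell, and the multiplicities match the observation counts, so the table itself looks self-consistent.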

Average unit cell: 37.57 51.51 88.75 90.00 90.00 90.00
Space group: P212121
Average mosaicity: 1.90


Juliana Ferreira de Oliveira
Brazilian Laboratory of Biosciences - LNBio
Brazilian Center for Research in Energy and Materials - CNPEM
Campinas-SP, Brazil
