I agree that you should try to use all the data. There is nothing wrong with solving your structure with the data you trust and then extending the resolution when your model is in an advanced state of refinement. If you worry whether your data has added value, you can use paired refinement to find a decent cut-off. The procedure is a bit of work, but you can use the PDB-REDO server (pdb-redo.eu) if you want a ready-made solution.
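For readers who want to see the logic of paired refinement, here is a minimal sketch. It assumes you have already refined the same starting model against each candidate cut-off and recalculated R-free for both models of a pair against the same, lower-resolution data. The `PairedStep` class, the `choose_cutoff` function, and the numbers in `example_steps` are hypothetical placeholders for illustration; they do not come from this dataset or from PDB-REDO.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PairedStep:
    """One paired-refinement comparison between two cut-offs (in Å)."""
    low_cutoff: float        # lower-resolution limit, e.g. 2.6
    high_cutoff: float       # higher-resolution limit, e.g. 2.5
    rfree_low_model: float   # model refined to low_cutoff, R-free at low_cutoff
    rfree_high_model: float  # model refined to high_cutoff, R-free at low_cutoff


def choose_cutoff(steps: List[PairedStep], tolerance: float = 0.0) -> float:
    """Extend the resolution step by step while R-free does not get worse.

    Both R-free values in a step are evaluated against the same data
    (to low_cutoff), so they are directly comparable.
    """
    if not steps:
        raise ValueError("need at least one paired-refinement step")
    # Start at the lowest resolution (largest d-spacing) and move outwards.
    ordered = sorted(steps, key=lambda s: s.low_cutoff, reverse=True)
    chosen = ordered[0].low_cutoff
    for step in ordered:
        if step.rfree_high_model <= step.rfree_low_model + tolerance:
            chosen = step.high_cutoff  # the extra shell helped (or did no harm)
        else:
            break  # the extra shell made the model worse; stop here
    return chosen


# Placeholder numbers only, NOT results for the dataset in this thread.
example_steps = [
    PairedStep(2.6, 2.5, rfree_low_model=0.300, rfree_high_model=0.298),
    PairedStep(2.5, 2.4, rfree_low_model=0.295, rfree_high_model=0.294),
    PairedStep(2.4, 2.3, rfree_low_model=0.292, rfree_high_model=0.296),
]

if __name__ == "__main__":
    print("Suggested cut-off (Å):", choose_cutoff(example_steps))
```

A small positive `tolerance` can be passed if you prefer to keep a shell when R-free is essentially unchanged; that choice is a judgement call, not a fixed rule.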
Cheers,
Robbie

From: Nicolas FOOS <nicolas.f...@esrf.fr>
Sent: Wednesday, 29 March 2017 18:19
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Large number of outliers in the dataset

Dear Juliana,

All the statistics presented here look good in terms of the resolution cut (I might even be less severe). For me the point is the mosaicity: you report 1.90, which is high in my opinion. How do your images look? I am wondering whether the indexing is really right. Maybe the complaint from Xtriage about outliers is due to this high mosaicity. What is Xtriage's diagnosis in terms of possible twinning? I am also wondering about pseudo-translation. Maybe try to reprocess your data with this in mind.

Hope this helps.

Nicolas

Nicolas Foos
PhD
Structural Biology Group
European Synchrotron Radiation Facility (E.S.R.F)
71, avenue des Martyrs
CS 40220
38043 GRENOBLE Cedex 9
+33 (0)6 76 88 14 87
+33 (0)4 76 88 45 19

On 29/03/2017 17:56, Mark J van Raaij wrote:

To be really convinced I think you should also compare the maps at 2.6 and 2.3 Å. If the 2.3 Å map looks better, go for it. If it doesn’t look better, perhaps you are adding noise, but the I/sigma and CC1/2 values suggest you aren’t. Perhaps try 2.5 and 2.4 Å also. And perhaps remove a well-ordered amino acid from the input model, refine at different resolutions and compare the difference maps for that amino acid. Or calculate omit maps at different resolutions and compare those.

Mark J van Raaij
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
calle Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://wwwuser.cnb.csic.es/~mjvanraaij

On 29 Mar 2017, at 17:44, Phil Evans <p...@mrc-lmb.cam.ac.uk> wrote:

It is not clear to me why you believe that cutting the resolution of the data would improve your model (which after all is the aim of refinement).
At the edge, CC(1/2) and I/sigI are perfectly respectable, and there doesn’t seem to be anything wrong with the Wilson plot. The R-factor will of course be higher if you include more weak data, but minimising R is _not_ the aim of refinement. You should keep all the data.

I don’t know what xtriage means by “large number of outliers”: perhaps someone else can explain.

Phil

On 29 Mar 2017, at 14:54, Juliana Ferreira de Oliveira <juliana.olive...@lnbio.cnpem.br> wrote:

Hello,

I have one dataset at 2.3 Å (it can probably go further: I/σ = 2.1 and CC1/2 = 0.779 in the outer shell; the summary data are below), but when I run the Xtriage analysis it says that “There are a large number of outliers in the data”. The space group is P212121. When I refine the MR solution, Rfree stalls at around 30% and doesn't decrease (in fact, if I continue refining it starts to increase). The Wilson plot does not fit very well between 2.3 and 2.6 Å:

[Attachment: image001.jpg, Wilson plot]

So I decided to cut the data at 2.6 Å, and the Xtriage analysis no longer reports outliers. I could refine the MR solution very well: the final Rwork is 0.2427 and Rfree is 0.2730, and validation in Phenix indicates a good structure. I ran Zanuda to confirm the space group, and it says that the space group assignment seems to be correct.

Do you think that I can improve my structure and solve it at 2.3 Å or better? Or can I finish it at 2.6 Å? To publish at 2.6 Å I need to justify the resolution cut, right? What should I say?

Thank you for your help!
Regards,
Juliana

Summary data:

                                       Overall  InnerShell  OuterShell
Low resolution limit                     51.51       51.51        2.42
High resolution limit                     2.30        7.27        2.30
Rmerge                                   0.147       0.054       0.487
Rmerge in top intensity bin              0.080           -           -
Rmeas (within I+/I-)                     0.155       0.057       0.516
Rmeas (all I+ & I-)                      0.155       0.057       0.516
Rpim (within I+/I-)                      0.048       0.017       0.164
Rpim (all I+ & I-)                       0.048       0.017       0.164
Fractional partial bias                 -0.006      -0.003       0.146
Total number of observations             83988        2907       11885
Total number unique                       8145         307        1167
Mean((I)/sd(I))                            9.3        23.9         2.1
Mn(I) half-set correlation CC(1/2)       0.991       0.998       0.779
Completeness                              99.9        99.5       100.0
Multiplicity                              10.3         9.5        10.2

Average unit cell: 37.57 51.51 88.75 90.00 90.00 90.00
Space group: P212121
Average mosaicity: 1.90

Juliana Ferreira de Oliveira
Brazilian Laboratory of Biosciences - LNBio
Brazilian Center for Research in Energy and Materials - CNPEM
Campinas-SP, Brazil
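
As a quick sanity check on the summary table above, the sketch below recomputes a few quantities from the reported numbers. It uses the Karplus and Diederichs relation CC* = sqrt(2·CC1/2 / (1 + CC1/2)) and the approximations Rpim ≈ Rmeas/sqrt(n) and Rmerge ≈ Rmeas·sqrt((n−1)/n). The latter two are exact only if every reflection has the same multiplicity n; using the mean multiplicity of a shell is an assumption, so small discrepancies are expected. The function names and the `shells` dictionary are illustrative, not part of any program's output or API.

```python
import math


def cc_star(cc_half: float) -> float:
    """Estimate CC* (correlation with the 'true' signal) from CC1/2."""
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


def rpim_from_rmeas(rmeas: float, multiplicity: float) -> float:
    """Approximate Rpim, assuming a uniform multiplicity n: Rmeas / sqrt(n)."""
    return rmeas / math.sqrt(multiplicity)


def rmerge_from_rmeas(rmeas: float, multiplicity: float) -> float:
    """Approximate Rmerge, assuming a uniform multiplicity n."""
    return rmeas * math.sqrt((multiplicity - 1.0) / multiplicity)


# Values copied from the summary table in the original post.
shells = {
    "Overall":    {"rmeas": 0.155, "rpim": 0.048, "rmerge": 0.147,
                   "mult": 10.3, "cc_half": 0.991},
    "InnerShell": {"rmeas": 0.057, "rpim": 0.017, "rmerge": 0.054,
                   "mult": 9.5, "cc_half": 0.998},
    "OuterShell": {"rmeas": 0.516, "rpim": 0.164, "rmerge": 0.487,
                   "mult": 10.2, "cc_half": 0.779},
}

if __name__ == "__main__":
    for name, s in shells.items():
        print(
            f"{name:10s} "
            f"CC*={cc_star(s['cc_half']):.3f}  "
            f"Rpim est={rpim_from_rmeas(s['rmeas'], s['mult']):.3f} "
            f"(reported {s['rpim']:.3f})  "
            f"Rmerge est={rmerge_from_rmeas(s['rmeas'], s['mult']):.3f} "
            f"(reported {s['rmerge']:.3f})"
        )
```

For the outer shell, CC1/2 = 0.779 corresponds to a CC* of about 0.94, which is consistent with the point made in the replies that the highest-resolution shell still carries signal.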