Dear Qing,

My first suggestion is a no-brainer: try to get better data. At 4.3 Å, model
building and refinement will remain painful whatever you try. Getting better
data includes improving your crystals (ligands, additives, etc.) and the
cryo-conditions. You could also try collecting data on, say, 100 crystals to
see whether one of them diffracts better or is less twinned. However, judging
from your question, this is probably not an option.

Concerning your current data set, I would also try molecular replacement with
the complete model (no poly-Ala), or at least with the complete core. For
example, if your protein has an Ile at a certain position and your model has a
Leu, this Leu is closer to your protein than the Ala you are using now.

I would run molecular replacement in the lower-symmetry P3x space groups, since
in that case no assumptions are made about whether the 2-folds are
crystallographic, non-crystallographic, or generated by twinning. Then I would
analyze the packing to see whether 2-folds are present, whether they could be
crystallographic or must be non-crystallographic, and whether the packing makes
sense. If no 2-folds are present, the twofold symmetry of your data must
be caused by twinning. If 2-folds are present, there could still be
twinning, and I would generate the twin-related molecule as well and
examine both together on the graphics to see what the implications are.

At low resolution you get severe model bias, which gives a large split between
R and Rfree, and twinned data or data with pseudo-crystallographic symmetry
will not improve the situation either. Randy or Garib could probably tell you
more precisely what that means for your R/Rfree values.

My 2 cents,
Herman




-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Qing Luan
Sent: Friday, September 07, 2012 1:49 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] poorly diffracting and twinned trigonal crystal

I have a ~4.3 Å data set from a trigonal crystal of a seven-subunit protein
complex, which I can scale in P3, P31, P32, P321, P3121 and P3221 with
similar statistics:

P3:
Shell Lower Upper Average      Average     Norm. Linear Square
 limit    Angstrom       I   error   stat. Chi**2  R-fac  R-fac
      50.00   9.25  1296.8    89.2    23.5  1.233  0.064  0.077
       9.25   7.35   356.3    18.5     9.7  1.512  0.065  0.066
       7.35   6.42    97.1     8.2     7.5  1.584  0.143  0.140
       6.42   5.83    55.2     8.3     8.1  1.503  0.247  0.241
       5.83   5.42    51.4     9.4     9.3  1.438  0.297  0.284
       5.42   5.10    47.0    10.5    10.5  1.469  0.374  0.345
       5.10   4.84    48.3    11.8    11.9  1.421  0.398  0.383
       4.84   4.63    43.6    12.9    13.1  1.474  0.488  0.449
       4.63   4.45    40.3    14.1    14.2  1.530  0.546  0.477
       4.45   4.30    30.8    14.7    15.0  1.601  0.732  0.631
  All reflections    203.8    19.6    12.3  1.477  0.125  0.085


P3121:

 Shell Lower Upper Average      Average     Norm. Linear Square
 limit    Angstrom       I   error   stat. Chi**2  R-fac  R-fac
      50.00   9.14  1242.9    51.8    18.3  1.200  0.057  0.068
       9.14   7.26   314.0    11.2     6.5  1.454  0.070  0.069
       7.26   6.35    86.9     5.3     5.0  1.499  0.158  0.152
       6.35   5.77    51.9     5.5     5.3  1.248  0.264  0.252
       5.77   5.35    46.9     6.1     6.0  1.213  0.330  0.305
       5.35   5.04    44.3     6.9     6.7  1.137  0.393  0.363
       5.04   4.79    43.4     7.7     7.4  1.109  0.434  0.407
       4.79   4.58    39.2     8.5     8.1  1.128  0.533  0.478
       4.58   4.40    34.2     9.1     8.6  1.115  0.634  0.549
       4.40   4.25    24.9     9.9     9.3  1.064  0.872  0.766
  All reflections    199.0    12.4     8.1  1.216  0.127  0.080

Unit cell parameters: 129.653   129.653   358.280    90.000    90.000   120.000

The systematic absences are consistent with P31, P32, P3121, or P3221.
Analyzing the cell contents in P3121 suggests either 1 mol/ASU (Matthews
coefficient of 3.86, 68.2% solvent) or 2 mol/ASU (Matthews coefficient of 1.93,
36.38% solvent).
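
As a cross-check on the cell-content numbers above, here is a minimal Python
sketch of the Matthews arithmetic. It assumes the common estimate
solvent fraction = 1 - 1.23/Vm and 6 asymmetric units per cell for P3121. The
function names are illustrative only, and because the molecular weight of the
complex is not given here, the script back-calculates the weight implied by
Vm = 3.86 rather than treating it as known.

```python
import math

PROTEIN_FACTOR = 1.23  # assumed constant in the usual estimate 1 - 1.23/Vm
ASU_PER_CELL = 6       # number of asymmetric units per cell in P3121


def hexagonal_cell_volume(a, c):
    """Cell volume in A^3 for a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(volume, asu_per_cell, copies_per_asu, mol_weight_da):
    """Vm in A^3/Da: cell volume divided by the total protein mass in the cell."""
    return volume / (asu_per_cell * copies_per_asu * mol_weight_da)


def solvent_fraction(vm):
    """Estimated solvent fraction for a given Matthews coefficient."""
    return 1.0 - PROTEIN_FACTOR / vm


volume = hexagonal_cell_volume(129.653, 358.280)

# Hypothetical molecular weight: the value implied by Vm = 3.86 at 1 mol/ASU.
implied_mw = volume / (ASU_PER_CELL * 1 * 3.86)

for copies in (1, 2):
    vm = matthews_coefficient(volume, ASU_PER_CELL, copies, implied_mw)
    print(f"{copies} mol/ASU: Vm = {vm:.2f} A^3/Da, "
          f"solvent = {100 * solvent_fraction(vm):.1f}%")
```

Doubling the number of copies halves Vm, which is what moves the estimate from
roughly 68% to roughly 36% solvent.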


I built a molecular replacement model (a poly-Ala model containing about 2/3 of
the protein complex) and ran Phaser in multiple space groups with one (for
P3121 or P3221) or two (for P31 or P32) copies of the model. Runs in P32 or
P3221 gave no solutions, or solutions with TFZ around 4-5. When run in P31 or
P3121, Phaser output solutions with TFZ > 11.0 and what appeared to be good
packing.

Rigid body refinement of the P3121 solution failed to improve the R-factor (it
hovered around 55.3%). Adding the missing subunits (as poly-Ala chains) based
on the Phaser solution and repeating rigid body refinement resulted in a
model with an Rfree of 48.5%. Refining with torsion angle dynamics and
restrained group B-factor refinement made the Rfree worse: it jumped up to
about 55.6%. The Rwork values were similar to the Rfree values for each
attempt. I also tried DEN refinement, with similar results.

Rigid body refinement of the P31 Phaser solution gave an Rfree of about 54.4%.
Adding the missing subunits and running rigid body refinement again improved
the Rfree to 53.0%. Refining with torsion angle dynamics and restrained group
B-factor refinement again made the Rfree worse (it increased to 54.5%).

I analyzed the reflection file processed in P31 using detect_twinning.inp in
CNS. The data did not appear to be perfectly merohedrally twinned, but in the
test for partial merohedral twinning, the twin fraction calculated for "2 along
a,b" was 0.475. I repeated rigid body refinement, then torsion angle dynamics
with restrained group B-factor refinement, using the calculated twinning
parameters. This brought the free R down to 46.3%, but caused significant
divergence between Rwork and Rfree (Rwork = 21.4%(!)). The Rfree is fairly
constant across resolution shells, but Rwork drops dramatically for the
low-resolution reflections (in the 50-9.14 Å shell, Rwork = 12.3%!).

I'm guessing that because the twin fraction is near 0.5, detwinning is not
working. Does anyone have any suggestions about how to successfully refine this
structure (assuming it is possible)? Should we average the twin-related
reflections to generate perfectly twinned data, and if so, how do we do that?
Is the twinning likely responsible for our difficulty refining the structure or 
could there be a problem with the space group assignment? Why does including 
the partial twinning in our refinement cause Rfree and Rwork to diverge so 
dramatically?  Given the trouble I've had so far and the poor quality of the 
data, I'm about ready to give up on this structure, but if anyone has any ideas 
please let me know. 
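
To make the near-0.5 problem concrete, here is a small Python sketch of the
standard hemihedral twinning algebra for a single pair of twin-related
reflections. It is not what CNS does internally. The function names are
illustrative, and the error factor assumes equal, uncorrelated errors on the
two observed intensities.

```python
import math


def twin(i1, i2, alpha):
    """Forward model: observed intensities of a twin-related pair."""
    return ((1 - alpha) * i1 + alpha * i2,
            alpha * i1 + (1 - alpha) * i2)


def detwin(j1, j2, alpha):
    """Invert the forward model; singular when alpha = 0.5."""
    denom = 1 - 2 * alpha
    if abs(denom) < 1e-12:
        raise ValueError("alpha = 0.5: the pair cannot be detwinned")
    return (((1 - alpha) * j1 - alpha * j2) / denom,
            ((1 - alpha) * j2 - alpha * j1) / denom)


def average_pair(j1, j2):
    """Replace both observations by their mean (treat as perfectly twinned)."""
    mean = 0.5 * (j1 + j2)
    return mean, mean


def error_amplification(alpha):
    """Factor by which the error on an observed intensity grows on detwinning."""
    return math.sqrt((1 - alpha) ** 2 + alpha ** 2) / abs(1 - 2 * alpha)


if __name__ == "__main__":
    for alpha in (0.1, 0.3, 0.45, 0.475):
        print(f"alpha = {alpha}: error factor = {error_amplification(alpha):.1f}")
```

For a twin fraction of 0.475 the factor is about 14, so algebraic detwinning
magnifies the measurement errors roughly 14-fold, and at exactly 0.5 it is
undefined. Averaging the pair is always well behaved, but it discards the
difference between the two reflections.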
