Hi all,
I am new to the process of refinement, so excuse me in advance if I
don't provide all the necessary details.
I collected three data sets of a particular protein in complex with
three different substrates (one protein bound to one substrate for
each structure). One of the crystals diffracted very well to 1.7A,
and I was able to solve the structure in the space group P212121
without problems.
However, the other two crystals diffracted to 2.3A, and although I
was able to scale them in the space group P21, both were highly
mosaic (one was 1.4, the other 1.7). There are plenty of existing
solutions for this protein, so I used molecular replacement with AMORE
and PHASER to build the initial model. Both programs seemed to find
the same solution, with two molecules in the asymmetric unit.
However, my problems begin when I try to build solvent with ARPWARP
using ARP/REFMAC cycles: the Rfactor drops to 24-26%, but Rfree never
goes below 34%. The extent to which Rfactor drops correlates with the
weighting factor I choose; I try to stick with a factor that yields
rms(bond) 0.015 and rms(angle) 1.5. Regardless, though I cannot see
clear density for the
substrate, the initial models in general look very good when loaded
in COOT and fit the density well. Subsequent rounds of refinement
lead only to drops in Rfactor, while Rfree stays around 33-34. I
realize that the model is the important aspect of refinement (i.e.
it's not all about statistics), but I am worried that there is
something fundamentally wrong here. When I calculate the
Ramachandran plots in COOT they look good, so my model is realistic
in terms of its chemical properties. Bottom line, I cannot see the
substrate density and the Rfree is not decreasing... or
alternatively, Rfactor IS decreasing while Rfree is not.
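To put a number on the gap I am describing, here is a small Python sketch using only the values quoted above. The function name r_gap is just something I made up for illustration; it is not part of any refinement program.

```python
def r_gap(r_work, r_free):
    """Difference between Rfree and Rfactor, both given in percent."""
    return r_free - r_work


# Rfactor ends up between 24 and 26, Rfree stays near 34 (values from this post).
for r_work in (24.0, 26.0):
    print(f"Rfactor {r_work:.0f}%, Rfree 34%: gap {r_gap(r_work, 34.0):.0f} points")
```

So the two statistics end up 8 to 10 percentage points apart.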
I am wondering if there are any ways to troubleshoot the refinement
process. It is possible that the substrate is not bound in the
active site, or that the data is just poor in quality. Also, the
number of unique reflections in these data sets is 15000, so the
percentage of reflections set aside for Rfree must be over 5% to
reach the recommended number of 1000 (so I have read).
Whether I set aside 1000 reflections for Rfree or fewer, I
still run into the same problem described above.
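For reference, the arithmetic behind that percentage as a quick Python sketch. The helper names are hypothetical and only illustrate the calculation; the 15000 and 1000 figures are the ones quoted above.

```python
def free_fraction(n_unique, n_free_target):
    """Fraction of unique reflections to flag so that roughly
    n_free_target of them end up in the free set."""
    if n_unique <= 0:
        raise ValueError("n_unique must be positive")
    return n_free_target / n_unique


def free_count(n_unique, percent):
    """Number of free reflections obtained by flagging a whole-number percent."""
    return n_unique * percent // 100


print(f"{free_fraction(15000, 1000):.1%}")  # fraction needed for 1000 free reflections
print(free_count(15000, 5))                 # free reflections at a 5% free set
```

With 15000 unique reflections, a 5% free set holds only 750 reflections, and reaching 1000 takes about 6.7%.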
Any advice would be greatly appreciated!
Keith
Department of Biochemistry & Molecular Pharmacology
970L Lazare Research Building
University of Massachusetts Medical School
364 Plantation Street
Worcester, MA 01605