We recently published a method for estimating F(000) by simply equating
it to Fcalc(000), the sum of all the electrons in the refined model
(including those in the bulk solvent):
http://dx.doi.org/10.1073/pnas.1302823110
The results of this procedure compared favorably with the two
experimentally determined F(000) values we could find in the literature:
http://dx.doi.org/10.1073/pnas.0806307105
http://dx.doi.org/10.1073/pnas.0609442104
and one more case where we used the average electron density of an MD
simulation (a situation where we definitely know how many electrons were
in the unit cell). All three "errors" were less than 10%, which I don't
think is so bad, considering that Fobs and Fcalc in general don't agree
to better than 20%.
So, if what you are after is the "best fit" electron density of the bulk
solvent, you will find this as the "scale factor" for "Partial
structure 1:" in your REFMAC log. With phenix.refine just look for
"kmask" (or "k_sol" in versions prior to 1.8.x). We did this for 834
PDBs and found k_sol ranged from 0.095 (1zqd) to 0.6 (1cxq) with median
0.429 +/- 0.05.
Getting the volume of the bulk solvent mask is, however, a lot
trickier. This is because it's hard to know where "ordered solvent" ends
and disordered "bulk solvent" begins. It is NOT generally what you
would expect from simply using the "average density of protein".
Fortunately, refinement programs are pretty good at finding ways to
explain all the electron density, one way or another. REFMAC has a
secret feature where if you specify an MSKOUT on the command line it
will write out a map file of the bulk solvent mask. Then you can just
take the average value of this map (voxels are either one or zero) to
get the volume fraction of solvent.
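As a minimal sketch of that last averaging step (assuming you have already read the mask voxels into a flat Python sequence of 0/1 values; the reading step and the name solvent_fraction are my own illustration, not part of REFMAC):

```python
def solvent_fraction(mask_voxels):
    """Return the mean of a 0/1 solvent mask, i.e. the solvent volume fraction."""
    total = 0
    count = 0
    for value in mask_voxels:
        total += value
        count += 1
    if count == 0:
        raise ValueError("mask is empty")
    return total / count


# Toy mask (made-up values): 3 of 8 voxels are flagged as bulk solvent.
toy_mask = [0, 1, 1, 0, 0, 1, 0, 0]
fraction = solvent_fraction(toy_mask)
```

This only works because the grid is uniform, so every voxel represents the same volume.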
With phenix.refine you need to extract the optimized k_sol, r_solv and
r_shrink parameters from the log file and then re-create the full Fcalc
map with phenix.fmodel. Then you make the same map again with ksol=0
and subtract that from the first map to get a map of just the bulk
solvent density. Unfortunately, this map must be calculated from
structure factors, so it has a mean value of zero and the lowest value
of the map is always going to be a little below "vacuum" because of
series-termination effects. You therefore have to look at the histogram
of density values to find the centroid of "negative" values (the protein
region with protein subtracted, or "vacuum") and see how far below
"zero" it is. If you take this "vacuum level shift" and multiply by the
unit cell volume, you recover the total number of electrons in the bulk
solvent model. Add this to the sum of the atomic numbers of all the
atoms in the coordinate file (weighting by occupancy), and you get
Fcalc(000). This is what we did in the above reference, but I imagine a
less convoluted path to Fcalc(000) may emerge soon.
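The final bookkeeping step can be sketched roughly as follows. This assumes you have already measured the "vacuum level shift" from the histogram, and that the atom list covers the whole unit cell rather than just the asymmetric unit (that last point is my assumption; the function name and the toy numbers are hypothetical):

```python
def fcalc_000(atoms, vacuum_level_shift, cell_volume):
    """Estimate Fcalc(000) in electrons.

    atoms: iterable of (atomic_number, occupancy) for every atom in the unit cell
    vacuum_level_shift: how far "vacuum" sits below zero, in electrons/A^3
    cell_volume: unit cell volume in A^3
    """
    # Electrons in the atomic model, weighted by occupancy.
    atom_electrons = sum(z * occ for z, occ in atoms)
    # Electrons in the bulk solvent model: shift times cell volume.
    solvent_electrons = vacuum_level_shift * cell_volume
    return atom_electrons + solvent_electrons


# Toy numbers only: one carbon, one half-occupied oxygen, 1000 A^3 cell.
toy_atoms = [(6, 1.0), (8, 0.5)]
toy_f000 = fcalc_000(toy_atoms, vacuum_level_shift=0.1, cell_volume=1000.0)
```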
But, to answer your question about density calculations:
for pure water:
1.0 g/cm^3
* (100 cm/ 1e10 A)^3
/ (18.015 g/mol)
* 6.02214e23 molecules/mol
* 10 electrons/molecule
= 0.334277 electrons/A^3
Another way to do it is to realize that 1 M of anything is 6.02214e-4
molecules/A^3 and pure water has a molar density of:
1.000 g/cm^3 / (18.015 g/mol) * 1000 cm^3/L
= 55.509 mol/L.
Then you can derive the electron density based on concentrations:
55.5 M oxygen * 6.02214e-4 atoms/A^3/M * 8 electrons/atom
= 0.2674 electrons/A^3
111 M hydrogen * 6.02214e-4 atoms/A^3/M * 1 electron/atom
= 0.0668 electrons/A^3
which sum to 0.3343 electrons/A^3
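Both routes to the water number can be checked with a few lines of Python. This is just a sketch of the arithmetic above; the function and constant names are my own:

```python
AVOGADRO = 6.02214e23          # molecules/mol
A3_PER_CM3 = 1e24              # cubic Angstroms per cm^3
# 1 M of anything, expressed in molecules per cubic Angstrom (6.02214e-4).
PER_A3_PER_MOLAR = AVOGADRO / 1000.0 / A3_PER_CM3


def density_to_electrons(rho_g_cm3, molar_mass, electrons_per_molecule):
    """Mass density (g/cm^3) of a pure substance -> electrons/A^3."""
    return rho_g_cm3 / molar_mass * AVOGADRO / A3_PER_CM3 * electrons_per_molecule


def molarity_to_electrons(molar, electrons_per_particle):
    """Molar concentration (mol/L) of one species -> electrons/A^3."""
    return molar * PER_A3_PER_MOLAR * electrons_per_particle


# Route 1: straight from the mass density of water.
water_direct = density_to_electrons(1.0, 18.015, 10)

# Route 2: via molarity, counting oxygen (8 e-) and hydrogen (1 e-) separately.
water_molar = 1.0 / 18.015 * 1000.0
water_by_atom = (molarity_to_electrons(water_molar, 8)
                 + molarity_to_electrons(2 * water_molar, 1))
```

The two routes have to agree, since they are the same dimensional analysis done in a different order.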
This opens the door to computing the electron density of more complex
mixtures, such as 4M (NH4)2SO4:
4 M * 6.02214e-4 molecule/A^3/M * 70 electrons/molecule
= 0.16862 electrons/A^3
and the rest comes from the water.
Here is where we get into trouble. What is the molar concentration of
water in a 4 M solution of (NH4)2SO4? Do you just replace one mole of
water for each mole of ion? Or do it by mass? What about
electrostriction? It is a common classroom demonstration that
dissolving salt in water reduces the total volume. The volume change
upon dissolution is very salt-specific and also not all that linear with
multi-component solutions. If you want to be sure, you need to measure
the mass density experimentally, as the second and third references
above did.
However, if you're not all that concerned by 5-10% error, you can
probably just add the volumes of all the substances being mixed and take
that as the final volume. For example, the density of solid (NH4)2SO4
can be calculated very accurately from its crystal structure:
unit cell: 7.924 10.526 5.953 90 90 90 (496.53 A^3)
Z = 4 formula units/cell (space group 33)
132.134 g/mol
* 4 molecules/cell
/ (496.53 A^3)
/ 6.02214e23 molecules/mol
/ (100 cm/ 1e10 A)^3
= 1.7676 g/cm^3
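The same crystal-density arithmetic as a small sketch (it assumes an orthogonal cell, so the volume is just a*b*c; the function name is mine):

```python
AVOGADRO = 6.02214e23   # formula units per mol
CM3_PER_A3 = 1e-24      # cm^3 in one cubic Angstrom


def crystal_density(a, b, c, z, molar_mass):
    """Mass density (g/cm^3) of a crystal with an orthogonal cell.

    a, b, c: cell edges in Angstroms (all angles assumed to be 90 degrees)
    z: formula units per unit cell
    molar_mass: g/mol of one formula unit
    """
    volume_cm3 = a * b * c * CM3_PER_A3
    grams_per_cell = z * molar_mass / AVOGADRO
    return grams_per_cell / volume_cm3


ammonium_sulfate_density = crystal_density(7.924, 10.526, 5.953, 4, 132.134)
```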
So the solid (NH4)2SO4 dissolved to make 1 cm^3 of 4 M solution
weighed:
4 mol/L * 1 cm^3 * 132.134 g/mol /(1000 cm^3/L) = 0.52854 g
and took up a volume of:
0.52854 g / (1.7676 g/cm^3)
= 0.29901 cm^3
The remaining volume must therefore be:
1 - 0.29901 cm^3
= 0.7010 cm^3 of water
0.7010 cm^3 of water weighs 0.7010 g, so the molar concentration of water
at 0.7010 g per cm^3 of solution is:
0.7010 g/cm^3 / (18.015 g/mol) * 1000 cm^3/L
= 38.91 M water
Now we can use the "1 M of anything" rule from above!
38.91 M * 6.02214e-4 molecule/A^3/M * 10 electrons/molecule
= 0.2343 electrons/A^3
which we now sum with the electron density of 4 M (NH4)2SO4 as a "gas" to:
0.2343 + 0.16862
= 0.4029 electrons/A^3
for a 4 M aqueous solution of (NH4)2SO4. Pretty close to Michael's
rumored value of 0.41 electrons/A^3.
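Here is the whole additive-volume estimate rolled into one function, as a sketch. It assumes volumes simply add (no electrostriction), water at 1.0 g/cm^3 and 18.015 g/mol, and the names are hypothetical:

```python
PER_A3_PER_MOLAR = 6.02214e-4   # molecules/A^3 for a 1 M solution
WATER_MOLAR_MASS = 18.015       # g/mol
WATER_DENSITY = 1.0             # g/cm^3
WATER_ELECTRONS = 10            # electrons per H2O


def solution_electron_density(molar, salt_molar_mass, salt_solid_density,
                              salt_electrons):
    """Electron density (e-/A^3) of a salt solution, assuming additive volumes."""
    # Grams of salt in 1 cm^3 of solution, and the volume it took up as a solid.
    salt_grams = molar * salt_molar_mass / 1000.0
    salt_volume = salt_grams / salt_solid_density
    # Whatever volume is left is taken to be plain water.
    water_volume = 1.0 - salt_volume
    water_molar = water_volume * WATER_DENSITY / WATER_MOLAR_MASS * 1000.0
    return PER_A3_PER_MOLAR * (molar * salt_electrons
                               + water_molar * WATER_ELECTRONS)


# 4 M (NH4)2SO4: 132.134 g/mol, 1.7676 g/cm^3 as a solid, 70 electrons.
as_4m = solution_electron_density(4.0, 132.134, 1.7676, 70)
pure_water = solution_electron_density(0.0, 132.134, 1.7676, 70)
```

Setting the molarity to zero should give back the pure water value, which is a handy sanity check.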
Note that the calculations above predict a total mass density for 4 M
(NH4)2SO4 in water as:
(0.52854 g + 0.7010 ) / 1.0 cm^3
= 1.2295 g/cm^3
or, more succinctly:
rho_calc(M) = 1+(M*132.134/1000)*(1.0-1.0/1.7676)
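The same formula as a sketch in Python, with the (NH4)2SO4 numbers from above as defaults (the water_density argument is my own generalization of the literal 1.0):

```python
def rho_calc(molar, molar_mass=132.134, solid_density=1.7676,
             water_density=1.0):
    """Predicted mass density (g/cm^3) of a salt solution, additive volumes."""
    # Grams of salt per cm^3 of solution.
    salt_mass = molar * molar_mass / 1000.0
    # The salt displaces its own solid volume of water.
    return water_density + salt_mass * (1.0 - water_density / solid_density)


rho_1m = rho_calc(1.0)
rho_2m = rho_calc(2.0)
rho_4m = rho_calc(4.0)
```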
But the Sigma website, where you can buy (NH4)2SO4 solutions, lists other
densities:
M      rho      rho_calc   %error
1      1.07     1.0574     1.2
2      1.13     1.1148     1.3
4.1    1.235    1.2295     0.4
which makes me think the last one just reported a density calculated the
way I just did it, without actually measuring it. It's actually hard to
find measured densities on the web. Nevertheless, 1.3% error is
probably plenty good for estimating electron density for your bulk
solvent. The error in the volume of the solvent region is going to be a
lot bigger than that. Of course, given that solutes like Na2S actually
take up a "negative" volume, concentrated solutions of it may be way
more "off" than good ol' (NH4)2SO4. It can also go the other way: more
hydrophobic ions make the water around them "order up" and become less
dense, much the way protein does.
Now one may wonder if the bulk density of the "solvent" may be in any
way relevant to the "nanoconfined" solvent in the channels of a protein
crystal. The concentration of protein inside the crystal is generally
around 500 g/L. Might that be enough for the "volume change on
dissolution" to be important? That's an excellent question. I have no
idea.
This is why our ancestors used to measure the density of their protein
crystals in a gradient of organic solvents. Iodobenzene and
ethylbenzene I think were the components of choice. Hard to get those
past safety these days.
And then, even if you do accurately measure the mass density of your
protein crystal and subtract out the mass of the protein in the crystal
(using the unit cell and molecular weight), you are still left with a
bit of a conundrum about how to convert a mass density into an electron
density. You need to know the mole fraction of all the components to do
that. Yes, you may know them in the bulk, but what if something
concentrates in the crystal? Such as chloride ions binding to the
protein? What is their "concentration" then? Anyone who has soaked a
crystal with platinum and seen it turn much pinker than the surrounding
solvent will understand what I mean.
So, in the end, I think the best way to estimate the electron density of
the stuff in the solvent channels is to take the best-fit "scale factor"
of the bulk solvent mask in refinement. Particularly after building in
everything that is "ordered".
In general, when trying to measure a "volume" in a sea of overlapping
peaks, I have found the best thing to do is fit a comprehensive model to
all the data and then extract the relevant component of the model after
the fit has converged. The nice thing about macromolecular refinement
is that the number of electrons in any peak is directly related to the
occupancy of the "atom" you place in it. Error bars you can get by
shaking the model and/or the data and seeing how the parameter you are
trying to measure "jiggles" in response, whether it be a refined
occupancy of a ligand, or the "scale" of the bulk solvent. This is also
what we did in the first reference above to put error bars on the
electron density itself.
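To illustrate the "shake it and watch it jiggle" idea, here is a toy sketch. A one-parameter least-squares scale fit stands in for refinement, and Gaussian noise stands in for shaking the data; none of this is the actual procedure from the paper, and all names and numbers are made up:

```python
import random
import statistics


def fit_scale(x, y):
    """Least-squares scale k for the model y ~ k * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def jiggle_error(x, y, sigma, trials=200, seed=0):
    """Refit the scale after adding noise to y; return (mean, spread) of the fits."""
    rng = random.Random(seed)
    fits = []
    for _ in range(trials):
        # "Shake" the data with Gaussian noise of width sigma.
        noisy = [yi + rng.gauss(0.0, sigma) for yi in y]
        fits.append(fit_scale(x, noisy))
    return statistics.mean(fits), statistics.stdev(fits)


# Toy data generated with a true scale of 0.4.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.4 * xi for xi in x]
mean_k, spread_k = jiggle_error(x, y, sigma=0.05)
```

The spread of the refitted parameter is what you would quote as the error bar.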
-James Holton
MAD Scientist
On Wed, Feb 5, 2014 at 11:05 AM, Michael C. Wiener <mwie...@virginia.edu> wrote:
Starting with the density (or specific volume) of water, you can
obtain rho(water)=0.33 e-/A^3. It's a nifty little bit of arithmetic
and dimensional analysis.
-MW
On Wed, 5 Feb 2014 10:39:57 -0800
Joseph Noel <n...@salk.edu> wrote:
>Dear All,
>
>Is there a way or program that is good at providing a solid
>estimate of F000 using the standard mean protein electron density of
>0.433 e/A**3 and a mean solvent electron density calculated from
>the crystallization conditions? I am wondering how the latter might
>affect F000 if the conditions are a bit different than the values
>assumed in most programs.
>
>I am "assuming" from these two values I can calculate a pretty
>good estimate of F000 using:
>
>F000 = mean_electron_density*Vcell(e-), where
>mean_electron_density=fraction_protein*0.433 e/A**3 +
>fraction_solvent*mean_solvent_electron_density(e/A**3).
>
>In short, how can one calculate the average solvent density
>instead of using the value assumed in many programs of 0.35? I know
>that 4M ammonium sulphate is 0.41 and pure water is 0.33 but not
>sure how these values are measured (or calculated).
>
>Thanks!
>
>Joe