[ccp4bb] Archiving for fraud detection

2011-11-04 Thread Chris Morris
One argument for archiving images has been that reprocessing could demonstrate 
deliberately deceptive structures.

In fact, what is needed for this is not necessarily the image. It is the last 
data file that was produced by a trusted computer. If the structure depends on 
mtz files produced at the synchrotron, then it is sufficient to authenticate 
the reduced data. The images are only needed for this purpose if they have been 
reprocessed.

regards,
Chris

Chris Morris  
chris.mor...@stfc.ac.uk   
Tel: +44 (0)1925 603689  Fax: +44 (0)1925 603634
Mobile: 07921-717915
Skype: chrishgmorris
http://pims.structuralbiology.eu/
http://www.citeulike.org/blog/chrishmorris
Daresbury Lab,  Daresbury,  Warrington,  UK,  WA4 4AD



Re: [ccp4bb] MR - small coiled coil, 1.65A = 1.000 solutions, all of them wrong

2011-11-04 Thread Sergei Strelkov

Dear Napoleão,

Thank you for updating everyone on your
efforts, and also acknowledging the advice.

I wanted to respond to your question regarding maps.
I know that many people who try to figure out
whether or not their MR solution is the right one
would ask the same question.

So first of all, if you wonder why you actually get very
decent-looking maps, the answer is a classical one:
because 'the phases are more important than amplitudes'.
The appearance of your map is defined by
your model phases, and hence a good match between the
model and the map /may not/ be taken as
a sign of a correct solution. Once again: /never ever!/

On the contrary, at least in your coil1.jpg
image I clearly see that the density exactly follows
the model which is almost entirely poly-Ala.
Unless your protein is really poly-Ala this should be
alarming. If you had a correct solution then you
would hope to see the (difference) density for
at least some missing side chains.

And a second point. Unless your model contains
the complete chain (which is rarely the case, especially
for the coiled coils, as discussed already)
a sign of the correct solution would be the appearance
of extra density near the N- and/or C-terminus of
the model. If it is not there, it is almost certainly not
a solution.

And you should not be worried about the R-factors being
very high at this stage. If the solution is correct then
you should see at least some extra features in the map.

Kind regards,
Sergei



Thank you all for the replies.

Sorry for taking so long to reply, I was actually trying some of your
interesting ideas (and I'm still trying).

I tried using the low resolution data sets for the molecular replacement
(thanks to Yuriy Patskovsky). I also improved and enlarged my coiled
coil database and employed it in many approaches using EPMR (interesting
program I was not aware of), which I found to produce lots of data,
hopefully addressing to some extent the bending of the helices (thanks to
Bernhard Rupp). I also tried some more tweaking in Phaser, although I am
not sure if I did it properly (thanks to Randy Read).

There is no twinning as far as I can tell (thanks to Ed Pozharski for
the tip). Using a data set with enough completeness (360 degrees @
Brookhaven) and processing in P1 did not help me because in this space
group there are most likely 2-3 helices in the asymmetric unit, which
complicates the problem (and it takes a lot of time for Phaser to run).
Automated approaches also did not yield a better result (as far as I can
tell). I'm convinced that the space group is C2221, but I may be wrong.

Thanks to Sergei Strelkov for the numerous useful suggestions on how to
approach the problem.

One of the big issues for me is to discriminate between a lot of
similarly good density maps. For example:

http://www.fullonline.org/coils/coil1.jpg
http://www.fullonline.org/coils/coil2.jpg

I have hundreds of solutions like these and I think they are all wrong.

I couldn't manage to run Arcimboldo, could not find a tutorial on it
either. It was highly recommended here (and elsewhere), so I'm
definitely willing to give it a try (thanks Isabel Uson).

You guys opened my eyes about a series of issues that I should learn
about and approach, I'm most thankful for that.
Best regards,
   Napo



--
Prof. Sergei V. Strelkov
Laboratory for Biocrystallography
Department of Pharmaceutical Sciences, Katholieke Universiteit Leuven
O&N2, Campus Gasthuisberg, Herestraat 49 bus 822, 3000 Leuven, Belgium
Work phone: +32 16 330845  Fax: +32 16 323469 OR +32 16 323460
Mobile: +32 486 294132
Lab pages: http://pharm.kuleuven.be/anafar



Re: [ccp4bb] MR - small coiled coil, 1.65A = 1.000 solutions, all of them wrong

2011-11-04 Thread George M. Sheldrick
We have unintentionally discovered a very simple way of telling whether
an MR solution is correct or not, provided that (as in this case) native
data have been measured to about 2.1A or better. This uses the current
beta-test of SHELXE that does autotracing (available on email request).

First rename the PDB file from MR to name.pda and generate a SHELX 
format file name.hkl, e.g. using Tim Gruene's mtz2hkl, where 'name' may
be chosen freely but should be the same for both input files. Then run 
SHELXE with a large number of autotracing cycles (here 50), e.g.

shelxe name.pda -a50 -s0.5 -y2

-s sets the solvent content and -y a resolution limit for generating
starting phases. If the .hkl file contains F rather than intensity the
-f switch is also required.

If the model is wrong the CC value for the trace will gradually
decrease as the model disintegrates. If the model is good the CC will
increase, and if it reaches 30% or better the structure is solved. In
cases with a poor but not entirely wrong starting fragment, the CC may
vary erratically for 10-30 cycles before it locks in to the correct 
solution and the CC increases over three or four cycles to the value
for a solved structure (25 to 50%). The solution with the best CC is
written to name.pdb and its phases to name.phs for input to e.g. Coot.
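
As a rough illustration of the recipe above, the following Python sketch
assembles the SHELXE command line from the switches mentioned in this message
(-a, -s, -y and -f) and applies a simplified version of the CC rule of thumb.
The function names, the 30% threshold as a hard cut-off, and the "probably
wrong" test (final CC lower than the first) are assumptions made for the
example; reading the per-cycle CC values from the SHELXE output is not shown,
since the output format is not described here.

```python
import shlex


def shelxe_command(name, cycles=50, solvent=0.5, y_limit=2.0, hkl_has_f=False):
    """Build the command line from the message; nothing is executed here."""
    args = [
        "shelxe",
        f"{name}.pda",        # MR model renamed from .pdb to .pda
        f"-a{cycles}",        # number of autotracing cycles
        f"-s{solvent:g}",     # solvent content
        f"-y{y_limit:g}",     # resolution limit for generating starting phases
    ]
    if hkl_has_f:
        args.append("-f")     # needed when name.hkl holds F rather than intensity
    return args


def judge_trace(cc_by_cycle, solved_cc=30.0):
    """Simplified reading of the per-cycle trace CC (in percent).

    Assumption: 'solved' if any cycle reaches solved_cc, 'probably wrong'
    if the CC ended lower than it started, otherwise 'undecided'.
    """
    if max(cc_by_cycle) >= solved_cc:
        return "solved"
    if cc_by_cycle[-1] < cc_by_cycle[0]:
        return "probably wrong"
    return "undecided"


print(shlex.join(shelxe_command("name")))
print(judge_trace([12.0, 15.0, 31.5]))
```

An erratic CC during the first 10-30 cycles would come out as "undecided"
here, which matches the advice to let the run continue before giving up.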

George


Re: [ccp4bb] Archiving for fraud detection

2011-11-04 Thread Zhijie Li
If the data files generated from trusted computers carried digital signatures 
they would be more trustworthy. Otherwise, a person with proper knowledge can 
still manipulate the data files, even if they are binary. If the image 
processing software routinely incorporated encrypted key information from the 
original data into the final data files, then data from any computer could be 
trusted. This would be best considering that in real life we often have to 
combat ice rings, split reflections, etc., at home.


For example, if the frames used for indexing the dataset are encrypted and 
saved with or within the final data file as "proof of experiment", in a 
universal format that can be used by the structure deposition servers to 
verify the reported space group, resolution, and, to some degree, the 
structure itself, it would probably serve the purpose.
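
A minimal sketch of the tamper-evidence idea, using only the Python standard
library. It uses an HMAC (a keyed hash with a shared secret) as a stand-in for
a true digital signature; a real scheme would use public-key signatures so
that a deposition server could verify a file without holding the secret. The
key, the function names and the toy reflection data are all hypothetical.

```python
import hashlib
import hmac


def tag_data(data: bytes, key: bytes) -> str:
    """Return a keyed SHA-256 tag for the given file contents."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_data(data: bytes, key: bytes, tag: str) -> bool:
    """Check the contents against a previously issued tag."""
    return hmac.compare_digest(tag_data(data, key), tag)


key = b"hypothetical-beamline-secret"            # assumption: held by the trusted computer
reduced = b"H K L I SIGI\n1 0 0 1523.4 12.1\n"   # toy stand-in for a reduced data file

tag = tag_data(reduced, key)
edited = reduced.replace(b"1523.4", b"9523.4")   # a manipulated copy

print(verify_data(reduced, key, tag))   # True: untouched data
print(verify_data(edited, key, tag))    # False: any edit changes the tag
```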


Zhijie





Re: [ccp4bb] Archiving for fraud detection

2011-11-04 Thread James Stroud

On Nov 4, 2011, at 2:09 AM, Chris Morris wrote:

> One argument for archiving images has been that reprocessing could 
> demonstrate deliberately deceptive structures.
> 
> In fact, what is needed for this is not necessarily the image. It is the last 
> data file that was produced by a trusted computer.


Although this is a good idea from the perspective of storage, it is difficult 
to implement. 

For this idea to work, you need (1) a certificate system and (2) a certificate 
authority. The certification is necessary to verify that the data file was 
indeed generated by a trusted computer. The chosen file needs to be certified 
by the authority and the certification archived on a trusted system. None of 
these requirements is terribly problematic. The infrastructure for a 
certificate system is free in the form of OpenSSL. Almost any lab or 
institution could easily become a certificate authority. The storage 
requirements for the certificates are trivial. For example, if a certificate 
were 2 KB, then, for the 8,000 structures per year, the storage requirements 
would be 16 MB per year. After about 125 years, we would fill up my $14.95 2 GB 
thumb drive.
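
The storage estimate can be worked through from the figures quoted above
(2 KB per certificate, 8,000 structures per year, a 2 GB drive). This short
Python check assumes decimal units (1 KB = 1000 bytes); the variable names are
chosen for the example only.

```python
CERT_BYTES = 2 * 1000            # 2 KB per certificate
STRUCTURES_PER_YEAR = 8000
DRIVE_BYTES = 2 * 1000**3        # 2 GB thumb drive

bytes_per_year = CERT_BYTES * STRUCTURES_PER_YEAR
years_to_fill = DRIVE_BYTES / bytes_per_year

print(bytes_per_year / 1000**2, "MB per year")   # 16.0 MB per year
print(years_to_fill, "years to fill the drive")  # 125.0 years
```

Either way the conclusion stands: certificate storage is negligible next to
the cost of archiving the images themselves.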

The difficulty is that certification should be done on the file before it is 
transferred from the trusted computer. This requires inserting the 
certification process somewhere in the transfer pipeline, which is difficult 
because it requires all the synchrotrons to actually implement it. Allowing the 
user to produce the certificate after transfer is as useful as having no 
certificate system at all.

Then there is the issue of data collection on a home source.

James



[ccp4bb] Call for submissions to Computational Crystallography Newsletter

2011-11-04 Thread Nigel Moriarty
Calling for articles and short communications of interest to structural
biologists. The deadline for publication in the January 2012 issue is
1st Dec, 2011.

The Computational Crystallography Newsletter (CCN) is an electronic
newsletter for structural biologists, and is published online every
6 months. Feature articles, meeting announcements and reports can be
submitted to me at any time for consideration. Submission of text by
email or word-processing files using the CCN templates is requested.
Past newsletters and the template are available at
www.phenix-online.org/newsletter.

Cheers

Nigel
-- 
Nigel W. Moriarty
Building 64R0246B, Physical Biosciences Division
Lawrence Berkeley National Laboratory
Berkeley, CA 94720-8235
Phone : 510-486-5709     Email : nwmoria...@lbl.gov
Fax   : 510-486-5909       Web  : CCI.LBL.gov


Re: [ccp4bb] atomic scattering factors in REFMAC

2011-11-04 Thread Ivan Shabalin
I'm very grateful to the community for supporting me with such interesting 
information!

Now my understanding of the subject is much better!


With best regards,
Ivan Shabalin


[ccp4bb] Xenon Derivatization

2011-11-04 Thread Brennan Bonnet
Hi everyone,

My name is Brennan Bonnet and I am doing my Master’s project on SAD phasing of 
proteins using xenon gas.  

I plan on doing several proteins but first I am trying to get everything 
working smoothly on lysozyme since it is easy to grow, diffracts well, and is 
already known to bind xenon atoms (PDB entry 1C10, 3 sites @ 8 bar pressure).

Put simply, my method involves growing the crystals, mounting them in 
cryoloops, cryoprotection (if required), pressurization by xenon gas using the 
Hampton Xenon Chamber and quickly freezing them using liquid nitrogen with only 
a couple of seconds between depressurization and freezing.  

I have chosen pressures and durations of pressurization based on previous work 
which indicates that suitable derivatives may be produced using pressures 
between 1-100 bar and that binding should be complete within minutes.  (see Use 
of Noble Gases Xenon and Krypton as Heavy Atoms in Protein Structure 
Determination by M. Schiltz, R. Fourme, and Thierry Prangé for a summary).

Therefore I have chosen to use pressures between 7-28 bar (100-400 psi) for a 
duration of 15 minutes.  After processing using XDS and solving with Phenix, 
the results show occupancies <0.1 which indicate no xenon binding or at least 
nothing that “sticks out”.  The general trend is that increased xenon pressure 
results in a stronger anomalous signal and I have attached a table below with 
some processed data from XDS.

I plan on trying a couple of other things.  I have been collecting at 7 keV but 
plan to try 6 keV in order to get closer to the xenon L-edges and get a better 
anomalous signal.  I also plan on trying longer pressurization times up to an 
hour to hopefully get better occupancy.

Has anyone had any success with this method or any sort of “Aha!” moment? Help 
in the matter would be much appreciated.

Set    Pressure  Energy  Anom   Anom   Anom Max /   Rmeas  Resolution  Last shell
       (psi)     (keV)   Max    Total  Anom Total   (%)    (Å)         I/σ(I)
Sep7   100       7       2.111  1.352  1.561390533  4      2.03        13.9
Sep7   160       7       1.935  1.244  1.555466238  4.7    2.03        12.17
Sep7   200       7       2.454  1.293  1.897911833  3.4    2.03        11.01
Jul28  200       7       3.13   1.749  1.789594054  2.2    2.03        17.6
Sep7   280       7       2.375  1.54   1.542207792  3.1    2.03        14.33
Jul28  400       7       3.082  1.649  1.869011522  3.5    2.03        14.29
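
For anyone wanting to check the numbers, here is a small Python sketch that
converts the chamber pressures from psi to bar and recomputes the
Anom Max / Anom Total column from the XDS values in the table above. The
conversion factor (1 psi = 0.0689476 bar) is a standard one; the function and
variable names are made up for the example.

```python
PSI_TO_BAR = 0.0689476  # 1 psi expressed in bar


def psi_to_bar(psi):
    """Convert a pressure from psi to bar."""
    return psi * PSI_TO_BAR


def anom_ratio(anom_max, anom_total):
    """Ratio of the largest anomalous signal to the overall one."""
    return anom_max / anom_total


# (set, pressure in psi, Anom Max, Anom Total), copied from the table
rows = [
    ("Sep7", 100, 2.111, 1.352),
    ("Sep7", 160, 1.935, 1.244),
    ("Sep7", 200, 2.454, 1.293),
    ("Jul28", 200, 3.13, 1.749),
    ("Sep7", 280, 2.375, 1.54),
    ("Jul28", 400, 3.082, 1.649),
]

for name, psi, amax, atot in sorted(rows, key=lambda r: r[1]):
    print(f"{name:6s} {psi:4d} psi = {psi_to_bar(psi):5.1f} bar  "
          f"ratio = {anom_ratio(amax, atot):.3f}")
```

Sorting by pressure makes it easier to judge whether the ratio really rises
with pressure or mostly differs between the two collection dates.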


Thanks,

~Brennan~