Hi Julia,

For a Table 1 you should make a sensible split of the atoms over which you calculate the mean. You might need to pool certain chains. There is not really a convenient tool for that, because the choice depends on the biology/biochemistry of your system. In practice, the easiest way is to use the command line on just the ATOM/HETATM records (atom_site in mmCIF format) in which you are interested. Calculating the mean of a column of values is pretty straightforward in awk.

Example for all atoms in a PDB file:

grep '^[HA][TE][OT][MA]' 100d_final.pdb | cut -c 61-66 | awk '{sum = sum + $1} END {print sum/NR}'
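Since the question is about non-H atoms per chain, the PDB one-liner above could be extended along the following lines. This is only a sketch under stated assumptions, not a tested tool: the function name mean_b_per_chain is made up for illustration, and it assumes standard fixed-column PDB records with a single-character chain ID in column 22, the B factor in columns 61-66 and the element symbol in columns 77-78. If the element column is empty, hydrogens will not be recognised.

```shell
# Per-chain atom count and mean B factor for non-hydrogen atoms in a PDB file.
# Hypothetical helper; assumes fixed-column PDB records with the element
# symbol present in columns 77-78.
mean_b_per_chain() {
  awk '
    /^(ATOM  |HETATM)/ {
      element = substr($0, 77, 2)
      gsub(/ /, "", element)                       # " H" -> "H"
      if (element == "H" || element == "D") next   # skip hydrogen and deuterium
      chain = substr($0, 22, 1)
      n[chain]++
      sum[chain] += substr($0, 61, 6)              # B factor column
    }
    END {
      for (c in n) printf "%s %d %.2f\n", c, n[c], sum[c] / n[c]
    }
  ' "$1" | sort
}
```

Calling mean_b_per_chain 100d_final.pdb would print one line per chain: chain ID, number of non-H atoms, mean B. Pooling chains (e.g. all waters or all metal ions) is then a matter of changing how the chain key is built.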
Or mmCIF:

grep '^[HA][TE][OT][MA]' /DATA/pdb_redo/00/100d/100d_final.cif | awk '{sum = sum + $16} END {print sum/NR}'

HTH,
Robbie

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Julia Griese
Sent: Thursday, November 26, 2020 15:11
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Mean B factors and number of atoms (Refmac/baverage)

Hi all,

I’m writing a Table 1 and getting a bit confused when it comes to number of atoms and average B factors. Refmac has these in the table in the GUI, but the atom numbers in that table seem to include H, and I’m only interested in non-H atoms.

As an example, the PDB file says:

REMARK 3 NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT.
REMARK 3 ALL ATOMS : 8351

Which agrees with the total count minus TER cards, so that seems to be correct. However, the table in the GUI for this refinement run looks like this:

Chain   Mean B   No. atoms
AAA     41.4     2193
BBB     57.7     3499
CCC     57.7     3499
DDD     41.7     2212
EEE     60.3     923
FFF     60.6     920
aaa     55.4     1323
ddd     56.0     1346
GGG     34.3     1
GaG     42.7     1
GbG     34.3     1
GcG     40.1     1
GdG     40.6     1
GeG     35.8     1
GfG     34.2     1
GgG     43.2     1
HHH     40.6     136

You can easily see that this adds up to a lot more than 8351 atoms. The numbers for the G chain (metal ions) and the H chain (water) are correct, whereas the numbers for the macromolecule chains appear to include H. (If I run a refinement with H output to the final file, I get approximately the same number of atoms in total, though not quite.) But what I’m really interested in is of course the number of non-H atoms per chain. I don’t want to count all the atoms by hand…

I used to use baverage to calculate average B factors (and that would also give me the number of non-H atoms per chain), but can’t get that to work on the command line and can’t find it in the i2 GUI. I don’t have the old ccp4i anymore.
So if anyone could either tell me how to get baverage to work, or if there is another way to extract these numbers, I would much appreciate it!

Best,
Julia

--
Dr. Julia Griese
Assistant Professor
Department of Cell and Molecular Biology
Uppsala University
BMC, Box 596
SE-75124 Uppsala
Sweden
email: julia.gri...@icm.uu.se
phone: +46-(0)18-471 4043
http://www.icm.uu.se/structural-biology/griese-lab/

E-mailing Uppsala University means that we will process your personal data. For more information on how this is performed, please read here: http://www.uu.se/en/about-uu/data-protection-policy

________________________________

To unsubscribe from the CCP4BB list, click the following link: https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/