Hello all (paging Paul and Alexei),
I recently ran into a problem in Coot and Molrep when parsing PDB files that contain inserted residues, i.e. residues that share a residue number and are distinguished only by an appended insertion code (A, B, etc.). For example, PDB file 2CMR has several instances of the following type in chain H:

ATOM   2169  N   LEU H  82      16.387  -4.529  38.070  1.00 25.36           N
ATOM   2170  CA  LEU H  82      15.979  -3.226  38.555  1.00 25.57           C
ATOM   2171  C   LEU H  82      15.316  -3.389  39.922  1.00 26.75           C
ATOM   2172  O   LEU H  82      14.360  -4.111  40.078  1.00 25.74           O
ATOM   2173  CB  LEU H  82      15.023  -2.596  37.544  1.00 25.63           C
ATOM   2174  CG  LEU H  82      14.413  -1.206  37.737  1.00 26.28           C
ATOM   2175  CD1 LEU H  82      15.420  -0.136  37.660  1.00 27.77           C
ATOM   2176  CD2 LEU H  82      13.343  -0.983  36.660  1.00 28.33           C
ATOM   2177  N   SER H  82A     15.825  -2.655  40.902  1.00 28.60           N
ATOM   2178  CA  SER H  82A     15.372  -2.752  42.281  1.00 29.89           C
ATOM   2179  C   SER H  82A     14.393  -1.640  42.642  1.00 29.70           C
ATOM   2180  O   SER H  82A     14.078  -0.798  41.806  1.00 29.43           O
ATOM   2181  CB  SER H  82A     16.596  -2.727  43.180  1.00 29.86           C
ATOM   2182  OG  SER H  82A     17.038  -4.056  43.307  1.00 34.03           O
ATOM   2183  N   SER H  82B     13.887  -1.673  43.880  1.00 30.63           N
ATOM   2184  CA  SER H  82B     12.924  -0.673  44.398  1.00 30.48           C
ATOM   2185  C   SER H  82B     11.938  -0.229  43.358  1.00 30.29           C
ATOM   2186  O   SER H  82B     11.818   0.962  43.102  1.00 29.95           O
ATOM   2187  CB  SER H  82B     13.654   0.570  44.911  1.00 31.02           C
ATOM   2188  OG  SER H  82B     14.509   0.257  45.981  1.00 32.62           O
ATOM   2189  N   LEU H  82C     11.231  -1.180  42.758  1.00 31.07           N
ATOM   2190  CA  LEU H  82C     10.297  -0.869  41.674  1.00 31.04           C
ATOM   2191  C   LEU H  82C      9.143   0.028  42.133  1.00 31.96           C
ATOM   2192  O   LEU H  82C      8.499  -0.235  43.149  1.00 31.17           O
ATOM   2193  CB  LEU H  82C      9.736  -2.138  41.050  1.00 30.89           C
ATOM   2194  CG  LEU H  82C     10.733  -2.983  40.270  1.00 30.03           C
ATOM   2195  CD1 LEU H  82C     10.151  -4.334  40.056  1.00 29.26           C
ATOM   2196  CD2 LEU H  82C     11.157  -2.335  38.918  1.00 28.52           C
ATOM   2197  N   ARG H  83       8.901   1.084  41.361  1.00 32.25           N
ATOM   2198  CA  ARG H  83       7.794   1.992  41.566  1.00 33.17           C
ATOM   2199  C   ARG H  83       6.921   1.896  40.344  1.00 32.91           C
ATOM   2200  O   ARG H  83       7.302   1.285  39.340  1.00 32.53           O
ATOM   2201  CB  ARG H  83       8.311   3.411  41.795  1.00 33.79           C
ATOM   2202  CG  ARG H  83       8.658   3.653  43.274  1.00 38.22           C
ATOM   2203  CD  ARG H  83       9.368   4.995  43.632  1.00 42.93           C
ATOM   2204  NE  ARG H  83      10.363   5.464  42.651  1.00 45.96           N
ATOM   2205  CZ  ARG H  83      11.557   4.897  42.402  1.00 47.45           C
ATOM   2206  NH1 ARG H  83      11.954   3.792  43.028  1.00 48.29           N
ATOM   2207  NH2 ARG H  83      12.359   5.445  41.501  1.00 48.14           N

Coot bug: the file reads and displays fine, but when you do real-space fitting on any of the inserted residues (say residue 82B), the fitting is actually done on residue 82.

Molrep bug: this one is particularly troublesome. The file is not parsed correctly: the residues end up renumbered, and the new numbering is not sequential.

Example Molrep output:

ATOM   3696  N   LEU A  84       7.810  42.393   1.658  1.00 20.00      A    N
ATOM   3697  CA  LEU A  84       9.191  42.617   1.283  1.00 20.00      A    C
ATOM   3698  C   LEU A  84       9.316  42.523  -0.237  1.00 20.00      A    C
ATOM   3699  O   LEU A  84       8.968  41.536  -0.839  1.00 20.00      A    O
ATOM   3700  CB  LEU A  84      10.070  41.578   1.978  1.00 20.00      A    C
ATOM   3701  CG  LEU A  84      11.590  41.538   1.815  1.00 20.00      A    C
ATOM   3702  CD1 LEU A  84      12.251  42.705   2.420  1.00 20.00      A    C
ATOM   3703  CD2 LEU A  84      12.123  40.253   2.463  1.00 20.00      A    C
ATOM   3704  N   SER A  85       9.870  43.569  -0.836  1.00 20.00      A    N
ATOM   3705  CA  SER A  85       9.985  43.683  -2.282  1.00 20.00      A    C
ATOM   3706  C   SER A  85      11.376  43.304  -2.777  1.00 20.00      A    C
ATOM   3707  O   SER A  85      12.243  42.955  -1.980  1.00 20.00      A    O
ATOM   3708  CB  SER A  85       9.627  45.107  -2.670  1.00 20.00      A    C
ATOM   3709  OG  SER A  85       8.233  45.144  -2.850  1.00 20.00      A    O
ATOM   3710  N   SER A  86      11.564  43.336  -4.101  1.00 20.00      A    N
ATOM   3711  CA  SER A  86      12.851  42.998  -4.754  1.00 20.00      A    C
ATOM   3712  C   SER A  86      13.564  41.867  -4.074  1.00 20.00      A    C
ATOM   3713  O   SER A  86      14.714  42.019  -3.684  1.00 20.00      A    O
ATOM   3714  CB  SER A  86      13.791  44.206  -4.761  1.00 20.00      A    C
ATOM   3715  OG  SER A  86      13.247  45.274  -5.494  1.00 20.00      A    O
ATOM   3716  N   LEU A  87      12.888  40.733  -3.930  1.00 20.00      A    N
ATOM   3717  CA  LEU A  87      13.456  39.590  -3.213  1.00 20.00      A    C
ATOM   3718  C   LEU A  87      14.708  39.032  -3.897  1.00 20.00      A    C
ATOM   3719  O   LEU A  87      14.719  38.789  -5.104  1.00 20.00      A    O
ATOM   3720  CB  LEU A  87      12.430  38.478  -3.049  1.00 20.00      A    C
ATOM   3721  CG  LEU A  87      11.268  38.789  -2.117  1.00 20.00      A    C
ATOM   3722  CD1 LEU A  87      10.188  37.795  -2.351  1.00 20.00      A    C
ATOM   3723  CD2 LEU A  87      11.681  38.825  -0.615  1.00 20.00      A    C
(...note residue numbering skips 88, but all residues are 'present')
ATOM   3724  N   ARG A  89      15.754  38.836  -3.097  1.00 20.00      A    N
ATOM   3725  CA  ARG A  89      16.991  38.223  -3.529  1.00 20.00      A    C
ATOM   3726  C   ARG A  89      17.151  36.958  -2.730  1.00 20.00      A    C
ATOM   3727  O   ARG A  89      16.409  36.715  -1.772  1.00 20.00      A    O
ATOM   3728  CB  ARG A  89      18.157  39.187  -3.320  1.00 20.00      A    C
ATOM   3729  CG  ARG A  89      18.323  40.139  -4.518  1.00 20.00      A    C
ATOM   3730  CD  ARG A  89      19.357  41.298  -4.373  1.00 20.00      A    C
ATOM   3731  NE  ARG A  89      19.422  41.922  -3.039  1.00 20.00      A    N
ATOM   3732  CZ  ARG A  89      18.474  42.695  -2.480  1.00 20.00      A    C
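
For anyone who wants to check how their own scripts cope with files like this, here is a minimal Python sketch of why the insertion code has to be part of the residue identifier. It assumes the standard fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27). The function names residue_key, distinct_residues and sequential_numbering are made up for illustration, and nothing here is meant to describe what Coot or Molrep actually do internally.

```python
def residue_key(line):
    """Return (chain, resSeq, iCode) for one fixed-column ATOM/HETATM record."""
    chain = line[21]             # column 22
    res_seq = int(line[22:26])   # columns 23-26
    i_code = line[26].strip()    # column 27; "" when there is no insertion code
    return chain, res_seq, i_code


def distinct_residues(lines, use_icode=True):
    """List residue keys in file order, optionally ignoring the insertion code."""
    keys = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        chain, res_seq, i_code = residue_key(line)
        key = (chain, res_seq, i_code if use_icode else "")
        if key not in keys:
            keys.append(key)
    return keys


def sequential_numbering(keys, start=1):
    """Hypothetical workaround: map each residue key to a consecutive number."""
    return {key: start + offset for offset, key in enumerate(keys)}


# Backbone N atoms of residues 82, 82A, 82B, 82C and 83 from chain H above.
SAMPLE = [
    "ATOM   2169  N   LEU H  82      16.387  -4.529  38.070  1.00 25.36           N",
    "ATOM   2177  N   SER H  82A     15.825  -2.655  40.902  1.00 28.60           N",
    "ATOM   2183  N   SER H  82B     13.887  -1.673  43.880  1.00 30.63           N",
    "ATOM   2189  N   LEU H  82C     11.231  -1.180  42.758  1.00 31.07           N",
    "ATOM   2197  N   ARG H  83       8.901   1.084  41.361  1.00 32.25           N",
]

if __name__ == "__main__":
    full = distinct_residues(SAMPLE)
    print(len(full))                                   # 5 distinct residues
    print(distinct_residues(SAMPLE, use_icode=False))  # [('H', 82, ''), ('H', 83, '')]
    print(sequential_numbering(full, start=82))
```

If the insertion code is dropped from the key, 82, 82A, 82B and 82C all collapse onto residue 82, which at least resembles the Coot symptom described above. The sequential_numbering helper is only one possible stopgap: renumber before running a program that mishandles insertion codes, and keep the map so the original numbering can be restored afterwards.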