These are a few questions and answers about PDB_REDO. New questions will be added when necessary. Mail us yours.
New entries are added regularly but it takes some time. If you are in a hurry, please e-mail us. There are a number of reasons why certain entries cannot be made at all. Our error annotation server WHY_NOT will tell you why.
We cannot redo without experimental data. If you can convince the structure depositor to submit the experimental data to the PDB, we will add an optimised structure model.
Fortunately not. There are a lot of high-quality structure models in the PDB that cannot be improved by means of automatic optimisation. Unfortunately, there are also a few structure models that are beyond repair. In any case, our script is set up so that the 'conservatively optimised' structure model is never worse than the original PDB entry. In a limited number of cases the fully optimised structure model is much worse than the original. Be sure to check the R-factors and the validation scores when you use a PDB_REDO entry.
Deviations of a few percent are quite common. Here are a few common reasons:
We can calculate an expected R-free/R ratio for unbiased refinement using an adapted version of Tickle et al. (Acta Cryst (1998), D54, 547-557). Multiplying R with this ratio gives us an expected R-free value. Based on Tickle et al. (Acta Cryst (2000), D56, 442-450) we can also calculate the R-free uncertainty σR-free. So we can now express the difference between the expected R-free and the calculated R-free in units of σR-free: a Z-score.
Ideally this score should be close to zero. Positive values indicate that there may be room for improvement of the structure, that convergence was not yet reached in the refinement, or that R-free was extremely biased (e.g. when a new R-free set was selected, or the wrong set deposited). Negative values indicate a problem with the structure model caused by specific errors or by overrefinement.
Please note that the Z-score may be unreliable for low-resolution (2.8 Å or lower) structure models or when the R-free set is very small.
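The arithmetic behind this Z-score can be sketched in a few lines of Python. This is only an illustration, not the code PDB_REDO runs: the function name and parameters are made up for the example, the expected R-free/R ratio and σR-free are taken as inputs rather than derived from the Tickle et al. formulas, and the sign (expected minus calculated) is assumed from the interpretation given above, where a positive score goes with an R-free that is lower than expected.

```python
def rfree_z_score(r, rfree, expected_ratio, sigma_rfree):
    """Return the R-free Z-score for one structure model.

    r              -- calculated R-factor (as a fraction, e.g. 0.20)
    rfree          -- calculated R-free (as a fraction, e.g. 0.25)
    expected_ratio -- expected R-free/R ratio for unbiased refinement
                      (hypothetical input; not computed here)
    sigma_rfree    -- estimated uncertainty of R-free
                      (hypothetical input; not computed here)
    """
    if sigma_rfree <= 0:
        raise ValueError("sigma_rfree must be positive")
    # Expected R-free: R multiplied by the expected R-free/R ratio.
    expected_rfree = r * expected_ratio
    # Difference between expected and calculated R-free, in units of sigma.
    # The sign convention is an assumption made for this sketch.
    return (expected_rfree - rfree) / sigma_rfree


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    z = rfree_z_score(r=0.20, rfree=0.25, expected_ratio=1.2, sigma_rfree=0.01)
    print(f"R-free Z-score: {z:.2f}")
```

With these made-up numbers the expected R-free (0.24) is lower than the calculated one (0.25), so the score is negative.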
The PDB_REDO script is continuously updated and new entries are re-refined with the latest PDB_REDO version (a changelog can be found in the PDB_REDO software). Existing entries will be updated eventually, but because an enormous amount of CPU time is required, this may take a while. If you need a series of structure models parsed with the same version of PDB_REDO, just ask.
This means that the values should not be used because the re-refinement did not improve the structure model. The values calculated from the data may be severely biased when there is something wrong with the R-free set. The values after re-refinement can only be used safely if the structure has changed, that is, if the structure was refined to convergence.
The validation is done on the structure that comes out of Refmac. If the re-refinement did not work, this is the entry obtained after PDB_REDO tried to reproduce the original refinement results. This may be slightly different from the original. For instance, some atoms are removed. The structure may also have been subjected to rigid-body refinement or 'pure' TLS refinement, if the original refinement could not be reproduced in the first attempt. In any case, the WHAT_CHECK scores are correct for the structure you download from PDB_REDO.
In crystallographic terms, an atom is an atom, no matter what its label. That is why Refmac writes out every atom as ATOM. HETATMs are purely administrative and the flags are typically added by the PDB at deposition. Most programs that use PDB files do not care about the labels. If yours does, then perhaps you should consider making it less rigid ;-). Of course, a (long) Perl one-liner can fix this problem and change the offending ATOM records back to HETATM records. If we get more than 20 complaints about this 'bug', such a one-liner will be added to the script. At the moment the counter stands at 1 (plus a complaint we found on a web forum).
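If you need the labels back in the meantime, something along the following lines may do. It is a rough sketch in Python rather than the Perl one-liner mentioned above, and it is not part of PDB_REDO. It rests on one loud assumption: every residue whose name is not a standard amino acid or nucleotide should be a HETATM. That assumption will not match the original PDB entry in every case, so check the result. The function name and the residue list are ours, made up for this example.

```python
# Residue names treated as "standard" polymer residues (an assumption for this
# sketch; extend or shrink the set to suit your own entries).
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
    "A", "C", "G", "U", "DA", "DC", "DG", "DT",
}


def restore_hetatm(lines, standard=STANDARD_RESIDUES):
    """Relabel ATOM records of non-standard residues as HETATM.

    `lines` is an iterable of PDB-format lines. The residue name is read
    from columns 18-20 of each record (string indices 17 to 19).
    """
    fixed = []
    for line in lines:
        if line.startswith("ATOM  ") and line[17:20].strip() not in standard:
            # Both record names are six characters wide, so the fixed
            # column layout of the rest of the line is preserved.
            line = "HETATM" + line[6:]
        fixed.append(line)
    return fixed


if __name__ == "__main__":
    import sys

    # Usage: python restore_hetatm.py < input.pdb > output.pdb
    sys.stdout.writelines(restore_hetatm(sys.stdin))
```

Only the record name is rewritten; coordinates, occupancies and B-factors are left untouched.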
Let's just call this an undocumented feature. We will try to solve this properly at some point. If you really need the Z-value, please ask me to write a work-around.
This is the result of a bug in PDB_REDO or a problem that we do not yet catch. We do check for entries like these, but we missed this one. You should complain about it.