Frequently Asked Questions



These are a few questions and answers about PDB_REDO. New questions will be added when necessary. Mail us yours.


I'm missing entry 9xyz. Why didn't you add it?

New entries are added regularly but it takes some time. If you are in a hurry, please e-mail us. There are a number of reasons why certain entries cannot be made at all. Our error annotation server WHY_NOT will tell you why.

What if there is no experimental data?

We cannot redo without experimental data. If you can convince the structure depositor to submit the experimental data to the PDB, we will add an optimised structure model.

Do all structures improve when run through PDB_REDO?

Fortunately not. There are a lot of high-quality structure models in the PDB that cannot be improved by means of automatic optimisation. Unfortunately, there are also a few structure models that are beyond repair. In any case, our script is set up so that the 'conservatively optimised' structure model is never worse than the original PDB entry. In a limited number of cases the fully optimised structure model is much worse than the original. Be sure to check the R-factors and the validation scores when you use a PDB_REDO entry.

Why is there a difference between the recalculated R(-free) and the value from the PDB header?

Deviations of a few percent are quite common. Here are a few common reasons:

Since PDB_REDO version 1.8 we follow this rule: if the R-factor from the header cannot be reproduced (with a tolerance of 0.10 or 10%), the optimisation is aborted and no PDB_REDO entry is made. Every once in a while we check these problematic structures by hand.
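
As an illustration only, the rule above could be sketched as follows. This is not the PDB_REDO code. The function name is hypothetical, and it is an assumption that the tolerance is applied to the absolute difference between the two R-factors, both expressed as fractions (0.20 rather than 20%).

```python
# Hypothetical sketch of the reproduction check; not taken from PDB_REDO.
R_TOLERANCE = 0.10  # 0.10 as a fraction, i.e. 10% when R is given in percent


def header_r_reproduced(r_header, r_recalculated, tolerance=R_TOLERANCE):
    """Return True if the recalculated R-factor is within the tolerance.

    Both R-factors must be fractions. Comparing the absolute difference
    is an assumption made for this sketch.
    """
    return abs(r_recalculated - r_header) <= tolerance
```

With these assumptions, a header R of 0.20 and a recalculated R of 0.25 would pass, while a recalculated R of 0.35 would cause the optimisation to be aborted.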

What is the R-free Z-score?

We can calculate an expected R-free/R ratio for unbiased refinement using an adapted version of Tickle et al. (Acta Cryst (1998), D54, 547-557). Multiplying R with this ratio gives us an expected R-free value. Based on Tickle et al. (Acta Cryst (2000), D56, 442-450) we can also calculate the R-free uncertainty σR-free. So we can now express the difference between the expected R-free and the calculated R-free in units of σR-free: a Z-score.
Ideally this score should be close to zero. Positive values indicate that there may be room for improvement of the structure, that convergence was not yet reached in the refinement, or that R-free was extremely biased (e.g. when a new R-free set was selected, or the wrong set deposited). Negative values indicate a problem with the structure model caused by specific errors or by overrefinement.
Please note that the Z-score may be unreliable for low-resolution structure models (2.8 Å or lower) or when the R-free set is very small.
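
The arithmetic described above can be sketched in a few lines. This is a minimal illustration, not the PDB_REDO implementation: the function and parameter names are hypothetical, the expected R-free/R ratio and σR-free are taken as inputs rather than derived from Tickle et al., and the sign convention (expected minus calculated) is an assumption chosen to match the interpretation given above, where a biased, too-low R-free gives a positive score.

```python
def rfree_z_score(r, expected_ratio, rfree_calculated, sigma_rfree):
    """Express the R-free difference in units of sigma(R-free).

    r                 -- R-factor of the model (fraction)
    expected_ratio    -- expected R-free/R ratio for unbiased refinement
    rfree_calculated  -- R-free calculated from the data (fraction)
    sigma_rfree       -- estimated uncertainty of R-free (fraction)
    """
    if sigma_rfree <= 0:
        raise ValueError("sigma_rfree must be positive")
    # Multiplying R with the expected ratio gives the expected R-free.
    rfree_expected = r * expected_ratio
    # Assumed sign convention: expected minus calculated.
    return (rfree_expected - rfree_calculated) / sigma_rfree
```

For example, with R = 0.20 and an expected ratio of 1.25 the expected R-free is 0.25. A calculated R-free of 0.27 with σR-free = 0.01 then gives a Z-score of about -2, which under the interpretation above would point at a problem with the model.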

Why do the entries have such different version numbers?

The PDB_REDO script is continuously updated and new entries are re-refined with the latest PDB_REDO version (a changelog can be found in the PDB_REDO software). Existing entries will be updated eventually, but because an enormous amount of CPU time is required, this may take a while. If you need a series of structure models processed with the same version of PDB_REDO, just ask.

Why do I get 'NA' for σR-free and the R-free Z-score after re-refinement for entry 9xyz?

This means that the values should not be used because the re-refinement did not improve the structure model. The values calculated from the data may be severely biased when there is something wrong with the R-free set. The values after re-refinement can only be used safely if the structure has changed, that is, if the structure was refined to convergence.

Why are the WHAT_CHECK scores (slightly) different before and after re-refinement when re-refinement did not 'change' the structure?

The validation is done on the structure that comes out of Refmac. If the re-refinement did not work, this is the entry obtained after PDB_REDO tried to reproduce the original refinement results. This may be slightly different from the original. For instance, some atoms are removed. The structure may also have been subjected to rigid-body refinement or 'pure' TLS refinement if the original refinement could not be reproduced in the first attempt. In any case, the WHAT_CHECK scores are correct for the structure you download from PDB_REDO.

Why are all HETATMs converted to ATOMs?

In crystallographic terms, an atom is an atom, no matter what its label. That is why Refmac writes out every atom as ATOM. HETATMs are purely administrative and the flags are typically added by the PDB at deposition. Most programs that use PDB files do not care about the labels. If yours does, then perhaps you should consider making it less rigid ;-). Of course, a (long) Perl one-liner can fix this problem and change the offending ATOM records back to HETATM records. If we get more than 20 complaints about this 'bug', such a one-liner will be added to the script. At the moment the counter stands at 1 (+ a complaint we found on a web forum).
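
Until then, a fix along these lines can be done locally. The following is a minimal sketch in Python rather than Perl, and it is not part of PDB_REDO. It assumes fixed-column PDB records (residue name in columns 18-20) and uses a simple heuristic: every ATOM record whose residue name is not one of the 20 standard amino acids or 8 standard nucleotides becomes HETATM. The residue list and the function name are assumptions made for this example. The heuristic cannot know how the PDB originally flagged unusual residues, so check the result against the original entry if the labels matter.

```python
# Heuristic list of residues that stay ATOM; an assumption of this sketch.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
    "A", "C", "G", "U", "DA", "DC", "DG", "DT",
}


def restore_hetatm(lines):
    """Relabel ATOM records of non-standard residues as HETATM.

    'lines' is an iterable of PDB records; all other records are
    returned unchanged and the column layout is preserved.
    """
    fixed = []
    for line in lines:
        # Columns 18-20 (0-based slice 17:20) hold the residue name.
        if line.startswith("ATOM  ") and line[17:20].strip() not in STANDARD_RESIDUES:
            # "ATOM  " and "HETATM" are both six characters wide.
            line = "HETATM" + line[6:]
        fixed.append(line)
    return fixed
```

To use it, read the PDB_REDO file into a list of lines, pass the list to restore_hetatm, and write the returned lines to a new file.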

Why are the Z-values in the CRYST1 card missing?

Let's just call this an undocumented feature. We will try to solve this properly at some point. If you really need the Z-value, please ask us to write a work-around.
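
In the meantime, one possible local work-around is to copy the Z-value from the CRYST1 card of the original PDB entry. The sketch below is not provided by PDB_REDO. It assumes the standard fixed-column PDB layout, in which the Z-value occupies columns 67-70 of the CRYST1 record, and the function names are hypothetical.

```python
def read_cryst1_z(line):
    """Return the Z-value from a CRYST1 record, or None if it is absent."""
    if not line.startswith("CRYST1"):
        raise ValueError("not a CRYST1 record")
    field = line[66:70].strip()  # columns 67-70
    return int(field) if field else None


def set_cryst1_z(line, z):
    """Return the CRYST1 record with Z written into columns 67-70.

    The returned record has no trailing newline.
    """
    if not line.startswith("CRYST1"):
        raise ValueError("not a CRYST1 record")
    # Pad or cut the record to 66 columns so Z lands in the right place.
    body = line.rstrip("\r\n").ljust(66)[:66]
    return f"{body}{z:>4d}"
```

A typical use would be to read Z from the CRYST1 line of the original PDB file with read_cryst1_z and, if it is not None, write it into the CRYST1 line of the PDB_REDO file with set_cryst1_z.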

What happened to entry 9xyz? PDB_REDO completely destroyed the structure model.

This is the result of a bug in PDB_REDO or a problem that we do not yet catch. We do check for entries like these, but we missed this one. You should complain about it.