This tutorial walks through the refinement of chain H of E. coli GroEL (PDB ID 8P4P), shown in Figure 5 of our paper:
"Extracting Information from Low Resolution Data"
The goal is to show how ROCKET can build a chain into a cryo-EM map that is especially low in resolution or especially noisy.
Two distinct subunit conformations can be observed in the GroEL heptameric rings. Here we focus on refinement of chain H as an example of the conformation in the bottom ring.
You will see a folder organized in the following manner:
For reproducibility, we have prepared all the necessary files in 8p4pH_preprocessing_outputs. If you want to generate them from scratch, follow the "Launch with Your Own Cryo-EM Data" and rk.preprocess documentation.
This example uses half maps, which are preferred for our error model. If you already have a pre-docked model for your own system and only one post-processed map, rk.preprocess can take --map alone. Keep in mind that this does not work when ROCKET still needs to search for the docked placement.
2. Run refinement
The preprocessing step generates two YAML files for rk.refine. You can run refinement inside 8p4pH_preprocessing_outputs with:
The settings used in this example are:
The refinement trajectory will be saved to ROCKET_outputs.
Note: the preprocessed data {file_id}-Edata.mtz is oversampled for cryo-EM/ET, because this helps at the docking stage. This is why we used a lower learning rate in the refinement YAML above. We have since implemented a config.downsample_ratio parameter; setting config.downsample_ratio=2 should account for the oversampling automatically, without any change to the learning rate.
3. Find the best-scoring model
ROCKET highlights the best-scoring model in its refinement trajectory and saves the final MSA cluster profile weight and bias tensors as best_feat_weights_H_{best_iteration}.pt and best_msa_bias_H_{best_iteration}.pt. These tensors can be used to re-predict the conformation that was found. The matching postRBR_{best_iteration}.pdb file can also be accessed directly in the output folder.
ROCKET refines chain H into density
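If you prefer to locate these files programmatically, the file-name patterns described in this step can be matched with a few lines of Python. This is a minimal sketch, not part of ROCKET: the helper name find_best_model is hypothetical, and it assumes that {best_iteration} is an integer and that the files sit somewhere under the output folder you pass in.

```python
import re
from pathlib import Path

# File-name patterns from the tutorial text. The chain ID "H" is specific to
# this example, and treating the iteration as an integer is an assumption.
BEST_TENSOR = re.compile(r"best_(feat_weights|msa_bias)_H_(\d+)\.pt")
POST_RBR = re.compile(r"postRBR_(\d+)\.pdb")


def find_best_model(output_dir):
    """Collect best-iteration tensors and the matching postRBR models.

    Hypothetical helper: searches output_dir recursively, so the exact
    sub-folder layout does not need to be known in advance.
    """
    root = Path(output_dir)
    best_iterations = set()
    tensors = []
    for path in sorted(root.rglob("*.pt")):
        match = BEST_TENSOR.fullmatch(path.name)
        if match:
            best_iterations.add(int(match.group(2)))
            tensors.append(path)

    # Keep only the postRBR models whose iteration matches a best iteration.
    # Comparing as integers tolerates zero-padded file names.
    models = []
    for path in sorted(root.rglob("postRBR_*.pdb")):
        match = POST_RBR.fullmatch(path.name)
        if match and int(match.group(1)) in best_iterations:
            models.append(path)

    return {
        "iterations": sorted(best_iterations),
        "tensors": tensors,
        "models": models,
    }
```

Calling find_best_model("ROCKET_outputs") returns the best iteration number(s) together with the paths of the tensor and model files, which you can then pass on to the finalization step.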
4. Finalize geometry and B-factors
This step has less of an effect at low resolution, but we still recommend a brief standard refinement run (phenix.refine was used in the paper) to refine B-factors and polish the geometry of the best-scoring model coming straight out of ROCKET.
Note
There is some stochasticity in the gradient descent and per-iteration rigid-body refinement steps. Two ROCKET runs started from the same inputs and parameters may still differ slightly.
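To check how much two runs actually differ, one simple option is to compare the Cα positions of their best-scoring models. The sketch below is an illustration, not part of ROCKET: the function names are hypothetical, it reads standard fixed-column PDB ATOM records, and it computes the RMSD without superposition on the assumption that both models were refined against the same map and therefore share a coordinate frame.

```python
import math


def read_ca_coords(pdb_path, chain_id=None):
    """Map (chain, residue number, insertion code) to CA coordinates.

    Relies on the fixed-column PDB ATOM record layout. If a residue has
    alternate locations, only the first CA encountered is kept.
    """
    coords = {}
    with open(pdb_path) as handle:
        for line in handle:
            if not line.startswith("ATOM"):
                continue
            if line[12:16].strip() != "CA":
                continue
            chain = line[21]
            if chain_id is not None and chain != chain_id:
                continue
            key = (chain, int(line[22:26]), line[26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords.setdefault(key, xyz)
    return coords


def ca_rmsd(path_a, path_b, chain_id=None):
    """Return (RMSD in angstroms, number of matched CA atoms).

    No superposition is applied: the models are assumed to be in the
    same frame already.
    """
    a = read_ca_coords(path_a, chain_id)
    b = read_ca_coords(path_b, chain_id)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no matching CA atoms between the two models")
    total = sum(math.dist(a[key], b[key]) ** 2 for key in shared)
    return math.sqrt(total / len(shared)), len(shared)
```

For example, ca_rmsd(model_from_run_1, model_from_run_2, chain_id="H") gives a single number summarizing the difference between two runs. What counts as a "slight" difference depends on the resolution of your map, so treat the value as a rough consistency check.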