Low Resolution GroEL Sub-Tomogram Average
Model Building at Low Resolution
This tutorial walks through refinement of chain H of E. coli GroEL (PDB ID 8P4P), shown in Figure 5 of our paper:
"Extracting Information from Low Resolution Data"
The goal is to show how ROCKET can be used to build a chain in a cryoEM model when the map is particularly low resolution or noisy.

Collect the necessary files
We have prepared ROCKET inputs for download at https://zenodo.org/uploads/15084558.
Download and decompress the file:
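As a minimal sketch, the decompression step can also be done from Python once the archive has been downloaded manually. The helper name `unpack_inputs` and the archive name in the usage comment are hypothetical placeholders, not names from the Zenodo record; `shutil.unpack_archive` infers the format (zip, tar, tar.gz, ...) from the file extension.

```python
import shutil
from pathlib import Path


def unpack_inputs(archive_path, dest_dir="."):
    """Unpack a downloaded archive into dest_dir and list what it contains."""
    archive = Path(archive_path)
    if not archive.is_file():
        raise FileNotFoundError(f"archive not found: {archive}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # The archive format is inferred from the file extension.
    shutil.unpack_archive(str(archive), str(dest))
    # Return the top-level entries so the folder layout can be inspected.
    return sorted(p.name for p in dest.iterdir())


# Usage (hypothetical file name; substitute the file you downloaded):
# print(unpack_inputs("rocket_inputs.zip", "groel_tutorial"))
```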
You will see a folder organized in the following manner:
For reproducibility, all the necessary files are provided in the 8p4pH_preprocessing_outputs folder. See the cryoEM tutorial and the API documentation for rk.preprocess if you want to prepare the inputs from scratch.
Refine starting prediction with ROCKET
The preprocessing command automatically generates two config YAML files for rk.refine. You can run the refinement from inside the 8p4pH_preprocessing_outputs folder with:
The detailed settings are:
The refinement trajectory will be saved to ROCKET_outputs.
Note: the preprocessed data {file_id}-Edata.mtz is oversampled for cryo-EM/ET, as this helps at the docking stage. This is why we used a lower learning rate in the refinement YAML above. We have since implemented a config.downsample_ratio parameter; setting config.downsample_ratio=2 should account for the oversampling automatically, without needing to change the learning rate.
Find the best scoring model
ROCKET highlights the best-scoring model in its refinement trajectory. It saves the corresponding MSA cluster profile weight and bias tensors as best_feat_weights_H_{best_iteration}.pt and best_msa_bias_H_{best_iteration}.pt, which can be used to re-predict the conformation found. The postRBR_{best_iteration}.pdb file can also be accessed directly in the output folder.
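The file-naming scheme above can be used to locate the best model programmatically. The following is a minimal sketch, not part of ROCKET: the helper `find_best_outputs` is hypothetical, and it assumes the three files sit together in the single directory you pass in. The iteration tag is taken verbatim from the weights file name, so no assumption is made about its numeric format.

```python
from pathlib import Path

WEIGHTS_PREFIX = "best_feat_weights_H_"


def find_best_outputs(output_dir):
    """Map each best-iteration tag to the files saved for that iteration.

    Assumes (not confirmed by ROCKET documentation) that all three files
    live directly in output_dir. Missing files are reported as None.
    """
    root = Path(output_dir)
    results = {}
    for weights in sorted(root.glob(WEIGHTS_PREFIX + "*.pt")):
        # Text between the prefix and ".pt" is the {best_iteration} tag.
        tag = weights.stem[len(WEIGHTS_PREFIX):]
        bias = root / f"best_msa_bias_H_{tag}.pt"
        model = root / f"postRBR_{tag}.pdb"
        results[tag] = {
            "feat_weights": weights,
            "msa_bias": bias if bias.is_file() else None,
            "model": model if model.is_file() else None,
        }
    return results


# Usage (the directory is whichever folder holds the files described above):
# for tag, files in find_best_outputs("ROCKET_outputs").items():
#     print(tag, files["model"])
```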

Finalize geometry and B-factors
We recommend a brief standard refinement run (phenix.refine was used in the paper) to refine B-factors and polish the geometry of the best-scoring model coming straight out of ROCKET, although this will have less of an effect at low resolution.
Note
There is a degree of stochasticity in the gradient descent and per-iteration rigid body refinement protocols. For this reason, there can be slight differences between ROCKET refinements started with the same inputs and parameters.
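One way to quantify such run-to-run differences is a Cα RMSD between the best models of two runs. The sketch below is an illustration, not a ROCKET utility: it parses fixed-column PDB ATOM records in pure Python and computes the RMSD without superposition, assuming both models were refined against the same map and therefore share a coordinate frame. The function names and the file names in the usage comment are hypothetical.

```python
import math


def read_ca_coords(pdb_path, chain_id=None):
    """Read CA coordinates keyed by (chain, residue number + insertion code)."""
    coords = {}
    with open(pdb_path) as handle:
        for line in handle:
            if not line.startswith("ATOM") or len(line) < 54:
                continue
            if line[12:16].strip() != "CA":
                continue
            if line[16] not in (" ", "A"):  # keep only the first alternate location
                continue
            chain = line[21]
            if chain_id is not None and chain != chain_id:
                continue
            key = (chain, line[22:27])
            coords[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def ca_rmsd(pdb_a, pdb_b, chain_id=None):
    """Return (RMSD, n_atoms) over CA atoms present in both models.

    No superposition is applied; both models are assumed to be in the
    same map frame.
    """
    a = read_ca_coords(pdb_a, chain_id)
    b = read_ca_coords(pdb_b, chain_id)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no shared CA atoms between the two models")
    total = sum(math.dist(a[k], b[k]) ** 2 for k in shared)
    return math.sqrt(total / len(shared)), len(shared)


# Usage (hypothetical paths to the best models of two independent runs):
# rmsd, n = ca_rmsd("run1/postRBR_best.pdb", "run2/postRBR_best.pdb", chain_id="H")
```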