# Launch with Your Own X-ray Data

This tutorial shows how to refine a prediction against your own **X-ray** dataset with ROCKET.

We use PDB entry 1LJ5 as the example, so `file_id` is `1lj5` in the commands below.

{% hint style="info" %}
This path is best when you already have experimental data, a target sequence, and precomputed alignments ready.
{% endhint %}

**Note:** Precompute your MSA files first. ROCKET currently expects `a3m` or `sto` alignments produced by an external server or database search. To use OpenFold locally, follow the [sequence database download instructions](https://openfold.readthedocs.io/en/latest/Inference.html); this requires about a terabyte of storage. The `--precomputed_alignment_dir` flag defaults to `alignments/`, and ROCKET uses all alignments found there.

### 1. Collect the required files

The ROCKET preprocessing script expects input files organized as follows:

```
<working_directory>/
├── {file_id}_fasta/
│   └── {file_id}.fasta       # FASTA file containing the chain to refine
│                             # Header should be "> {file_id}"
│
├── {file_id}_data/
│   ├── *.mtz                 # For X-ray data
│   └── <optional files>/     # e.g., predicted or docked models
│
├── alignments/               # (default: --precomputed_alignment_dir)
│   └── {file_id}
│       └── *.a3m / *.hhr     # MSA files for the input sequence
```
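A quick layout check before preprocessing can catch a misplaced file early. The sketch below is a hypothetical helper (`check_rocket_inputs` is not part of ROCKET) that tests for the FASTA file, at least one MTZ file, and the per-target alignment folder. It assumes the default `alignments/` directory name unless you pass a different one.

```bash
# Sketch: verify the expected input layout before running rk.preprocess.
# check_rocket_inputs is a hypothetical helper, not a ROCKET command.
check_rocket_inputs() {
  local file_id="$1"
  local workdir="${2:-.}"            # working directory, defaults to the current one
  local align_dir="${3:-alignments}" # assumed default alignment directory name
  local missing=0

  # FASTA file containing the chain to refine
  if [ ! -f "$workdir/${file_id}_fasta/${file_id}.fasta" ]; then
    echo "missing: ${file_id}_fasta/${file_id}.fasta" >&2
    missing=1
  fi

  # At least one MTZ file with the X-ray data
  if ! ls "$workdir/${file_id}_data"/*.mtz > /dev/null 2>&1; then
    echo "missing: ${file_id}_data/*.mtz" >&2
    missing=1
  fi

  # Per-target folder holding the precomputed alignments
  if [ ! -d "$workdir/$align_dir/$file_id" ]; then
    echo "missing: $align_dir/$file_id/" >&2
    missing=1
  fi

  return "$missing"
}
```

For the example system, calling `check_rocket_inputs 1lj5 .` from the working directory returns 0 when all three pieces are in place and prints the missing paths otherwise.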

### 2. Run preprocessing

Once your files are organized, run `rk.preprocess`:

```bash
rk.preprocess \
  --file_id 1lj5 \
  --method xray \
  --output_dir ./1lj5_processed \
  --precomputed_alignment_dir alignments/ \
  --max_recycling_iters 20 \
  --use_deepspeed_evoformer_attention
```

**Note:** If the MTZ file contains more than one useful column set, Phaser will choose the best one automatically. In most cases, intensities are preferred if they are available. If you want to force a specific column set, provide an MTZ file that only contains that data. You can inspect the file with:

```bash
rs.mtzdump {file_id}_data/xxx.mtz
```

### 3. Review the generated configs

`rk.preprocess` generates two YAML files under `--output_dir` for `rk.refine`. Review them before you start refinement.

{% hint style="success" %}
In most cases, the default phase 1 and phase 2 configs are a good place to start.
{% endhint %}

To generate another set of default config files, run `rk.config`:

```bash
rk.config --mode both --datamode xray --working-dir 1lj5_processed --file-id 1lj5
```

The `--mode both` flag sets up the default phase 1 and phase 2 workflow. You can edit the saved config files if you want to test a specific condition.

### Optional: Multiple chains in the ASU

If you have multiple chains in the asymmetric unit, ROCKET does not currently refine all chains at once. The current workaround is to refine one chain while keeping the others fixed.

To use this mode, place the docked fixed chain file in `ROCKET_inputs` and rename it to `{file_id}_added_chain.pdb`:

```
ROCKET_inputs/
├── {file_id}-pred-aligned.pdb  # Aligned prediction with pseudo-Bs
├── {file_id}-Edata.mtz         # Experimental data in LLG convention
└── {file_id}_added_chain.pdb   # PDB file containing fixed chains, excluding the chain targeted for refinement
```

Then update the refinement config:

```yaml
features:
    ...
    additional_chain: true
    total_chain_copy: 2.0 # 2.0 for dimer in the ASU, 3.0 for trimer, etc. For ligand, 1.0 should be enough
```
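The staging step for the fixed chains can be scripted so the file always lands under the expected name. This is only a sketch: `stage_added_chain` is a hypothetical helper, and the location of your `ROCKET_inputs` folder is passed in as an argument rather than assumed.

```bash
# Sketch: copy a docked fixed-chain model into ROCKET_inputs under the expected name.
# stage_added_chain is a hypothetical helper, not a ROCKET command.
stage_added_chain() {
  local file_id="$1"
  local fixed_pdb="$2"   # docked model with the fixed chains only
  local inputs_dir="$3"  # path to your ROCKET_inputs folder

  if [ ! -f "$fixed_pdb" ]; then
    echo "fixed-chain model not found: $fixed_pdb" >&2
    return 1
  fi
  if [ ! -d "$inputs_dir" ]; then
    echo "ROCKET_inputs folder not found: $inputs_dir" >&2
    return 1
  fi

  # Copy rather than move, so the original docked model is preserved
  cp "$fixed_pdb" "$inputs_dir/${file_id}_added_chain.pdb"
}
```

A call might look like `stage_added_chain 1lj5 docked_fixed_chain.pdb path/to/ROCKET_inputs`, where both the model name and the path are placeholders for your own files.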

### 4. Run refinement

Run the phase 1 config first:

```bash
rk.refine 1lj5_processed/ROCKET_config_phase1.yaml
```

This will start the refinement.

If you want live run tracking, see [Track Refinement with Weights & Biases](/rocket-docs/track-refinement-with-weights-and-biases.md).

The standard workflow is phase 1 followed by phase 2, which uses a lower learning rate:

```bash
rk.refine 1lj5_processed/ROCKET_config_phase2.yaml
```

Phase 2 requires an existing phase 1 folder. If you want to start with a lower learning rate, edit `ROCKET_config_phase1.yaml` directly.
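If you prefer to launch both phases unattended, the two commands can be chained so that phase 2 only starts after phase 1 finishes. The wrapper below is a sketch, not part of ROCKET: `run_rocket_phases` and the `REFINE_CMD` override are hypothetical names, and it assumes `rk.refine` exits with a non-zero status when a run fails.

```bash
# Sketch: run phase 1, then phase 2 only if phase 1 succeeded.
# run_rocket_phases and REFINE_CMD are hypothetical; REFINE_CMD defaults to rk.refine.
run_rocket_phases() {
  local processed_dir="$1"
  local refine_cmd="${REFINE_CMD:-rk.refine}"
  local phase config

  for phase in phase1 phase2; do
    config="$processed_dir/ROCKET_config_${phase}.yaml"
    if [ ! -f "$config" ]; then
      echo "config not found: $config" >&2
      return 1
    fi
    echo "running $phase with $config"
    # Assumes a failed run returns a non-zero exit status
    "$refine_cmd" "$config" || return 1
  done
}
```

For the example system this would be called as `run_rocket_phases 1lj5_processed`. Review the generated configs first, since the wrapper does not pause between phases.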

### 5. Finalize geometry and B-factors

We recommend a short standard refinement run afterwards to polish the geometry and B-factors of the ROCKET output. We used `phenix.refine` in the paper.

### A note from us

We hope to make ROCKET as useful and general as we can. If you run into setup issues, let us know and we will try to help.

[Create an issue](https://github.com/alisiafadini/ROCKET/issues) in our repo

