Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Data Flow

Overview

                       ┌──────────────┐
 PDB / mmCIF / BCIF ──>│              ├──> Vec<MoleculeEntity>
 MRC / CCP4         ──>│   Adapters   ├──> Density (SurfaceEntity)
 DCD                ──>│              ├──> Vec<DcdFrame>
                       └──────┬───────┘
                              │
                              v
                  ┌───────────────────────┐
                  │       Entities        │
                  │                       │
                  │  MoleculeEntity       │
                  │  SurfaceEntity        │
                  └──┬────────┬────────┬──┘
                     │        │        │
                     v        v        v
              ┌──────────┐ ┌────────┐ ┌─────────┐
              │ Analysis │ │  Ops   │ │  Codec  │
              │          │ │        │ │         │
              │ dssp     │ │ kabsch │ │serialize│
              │ bonds    │ │ align  │ │serialize│
              │ disulfide│ │extract │ │_assembly│
              │ aabb     │ │        │ │         │
              └──────────┘ └────────┘ └────┬────┘
                                           │
                                           v
                                  FFI / IPC / Python

Analysis, Transform, and Codec are independent — use any combination depending on what you need.

1. Parsing

Every structure adapter returns Vec<MoleculeEntity>:

let entities = pdb_file_to_entities(Path::new("1ubq.pdb"))?;
let entities = mmcif_file_to_entities(Path::new("3nez.cif"))?;
let entities = bcif_file_to_entities(Path::new("1ubq.bcif"))?;

Density and trajectory adapters return their own types:

let density = mrc_file_to_density(Path::new("emd_1234.map"))?;
let frames = dcd_file_to_frames(Path::new("trajectory.dcd"))?;

2. Entity splitting

split_into_entities groups atoms by:

  1. Chain ID + molecule type for polymers (one entity per chain)
  2. Chain ID + residue number for small molecules (one entity each)
  3. All waters into a single Bulk entity
  4. All solvents into a single Bulk entity

Each entity gets a unique EntityId.

3. Analysis

let (ss_types, hbonds) = detect_dssp(&backbone_residues);
let bonds = infer_bonds(&atoms, DEFAULT_TOLERANCE);
let disulfides = detect_disulfide_bonds(&atoms);
let aabb = entity.aabb();

4. Transforms

let (rotation, translation) = kabsch_alignment(&reference_ca, &target_ca);
transform_entities(&mut entities, rotation, translation);
let ca_positions = extract_ca_positions(&entities);

5. Serialization

For sending to C/C++/Python consumers:

// COORDS01 (flat atom array)
let bytes = serialize(&merge_entities(&entities))?;

// ASSEM01 (preserves entity types)
let bytes = serialize_assembly(&entities)?;