Data Flow
Overview
                      ┌──────────────┐
PDB / mmCIF / BCIF ──>│              ├──> Vec<MoleculeEntity>
MRC / CCP4         ──>│   Adapters   ├──> Density (SurfaceEntity)
DCD                ──>│              ├──> Vec<DcdFrame>
                      └──────┬───────┘
                             │
                             v
                 ┌───────────────────────┐
                 │       Entities        │
                 │                       │
                 │  MoleculeEntity       │
                 │  SurfaceEntity        │
                 └──┬────────┬────────┬──┘
                    │        │        │
                    v        v        v
            ┌──────────┐ ┌────────┐ ┌─────────┐
            │ Analysis │ │  Ops   │ │  Codec  │
            │          │ │        │ │         │
            │ dssp     │ │ kabsch │ │serialize│
            │ bonds    │ │ align  │ │serialize│
            │ disulfide│ │extract │ │_assembly│
            │ aabb     │ │        │ │         │
            └──────────┘ └────────┘ └────┬────┘
                                         │
                                         v
                                FFI / IPC / Python
Analysis, Ops (transforms), and Codec are independent of one another; use whichever combination you need.
1. Parsing
Every structure adapter returns Vec<MoleculeEntity>:
use std::path::Path;

let entities = pdb_file_to_entities(Path::new("1ubq.pdb"))?;
let entities = mmcif_file_to_entities(Path::new("3nez.cif"))?;
let entities = bcif_file_to_entities(Path::new("1ubq.bcif"))?;
Density and trajectory adapters return their own types:
let density = mrc_file_to_density(Path::new("emd_1234.map"))?;
let frames = dcd_file_to_frames(Path::new("trajectory.dcd"))?;
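When the input format is not known ahead of time, a caller needs some way to pick the right adapter family. The sketch below shows one way to do that by file extension. `InputKind`, `classify`, and the extension list are hypothetical names and choices made for illustration; they are not part of the library described here.

```rust
use std::path::Path;

/// Hypothetical classification of the three adapter families named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InputKind {
    Structure,  // PDB / mmCIF / BCIF -> Vec<MoleculeEntity>
    Density,    // MRC / CCP4 -> Density
    Trajectory, // DCD -> Vec<DcdFrame>
}

/// Guess the adapter family from the file extension (case-insensitive).
/// The extension list is an assumption made for this sketch.
fn classify(path: &Path) -> Option<InputKind> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "pdb" | "cif" | "mmcif" | "bcif" => Some(InputKind::Structure),
        "mrc" | "map" | "ccp4" => Some(InputKind::Density),
        "dcd" => Some(InputKind::Trajectory),
        _ => None,
    }
}
```

A caller would match on the returned kind and invoke the corresponding `*_file_to_*` function. Unknown or missing extensions yield `None` so the caller can report an error instead of guessing.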
2. Entity splitting
split_into_entities groups atoms by:
- Chain ID + molecule type for polymers (one entity per chain)
- Chain ID + residue number for small molecules (one entity each)
- All waters into a single Bulk entity
- All solvents into a single Bulk entity
Each entity gets a unique EntityId.
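The grouping rules above can be sketched as a key function plus a map. This is a minimal illustration, not the implementation of split_into_entities: `Atom`, `Kind`, `GroupKey`, and `split` are hypothetical stand-ins, and the index in the returned list plays the role of a unique id.

```rust
use std::collections::BTreeMap;

/// Hypothetical molecule categories; the library's own type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Kind {
    Protein,
    Dna,
    Ligand,
    Water,
    Solvent,
}

/// Minimal stand-in for an atom record (illustrative fields only).
#[derive(Debug, Clone)]
struct Atom {
    chain: char,
    res_num: i32,
    kind: Kind,
}

/// Grouping key that mirrors the four rules listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum GroupKey {
    Polymer { chain: char, kind: Kind },
    SmallMolecule { chain: char, res_num: i32 },
    Waters,
    Solvents,
}

fn group_key(atom: &Atom) -> GroupKey {
    match atom.kind {
        Kind::Protein | Kind::Dna => GroupKey::Polymer { chain: atom.chain, kind: atom.kind },
        Kind::Ligand => GroupKey::SmallMolecule { chain: atom.chain, res_num: atom.res_num },
        Kind::Water => GroupKey::Waters,
        Kind::Solvent => GroupKey::Solvents,
    }
}

/// Returns the atom indices belonging to each group, in key order.
fn split(atoms: &[Atom]) -> Vec<(GroupKey, Vec<usize>)> {
    let mut groups: BTreeMap<GroupKey, Vec<usize>> = BTreeMap::new();
    for (i, atom) in atoms.iter().enumerate() {
        groups.entry(group_key(atom)).or_default().push(i);
    }
    groups.into_iter().collect()
}
```

Note that waters and solvents ignore the chain ID entirely, which is why they collapse into one group each regardless of which chain they were recorded under.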
3. Analysis
let (ss_types, hbonds) = detect_dssp(&backbone_residues);
let bonds = infer_bonds(&atoms, DEFAULT_TOLERANCE);
let disulfides = detect_disulfide_bonds(&atoms);
let aabb = entity.aabb();
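To give a feel for what a tolerance parameter means in bond inference, here is a minimal sketch of the common distance-based approach: two atoms are treated as bonded when their separation is at most the sum of their covalent radii plus a tolerance. This is an assumption about the general technique, not a description of how infer_bonds works internally. `Atom`, `covalent_radius`, and `infer_bonds_sketch` are hypothetical names, and the radii table is deliberately tiny.

```rust
/// Illustrative atom: element symbol plus Cartesian position in angstroms.
struct Atom {
    element: &'static str,
    pos: [f32; 3],
}

/// Approximate single-bond covalent radii in angstroms (small illustrative table).
fn covalent_radius(element: &str) -> f32 {
    match element {
        "H" => 0.31,
        "C" => 0.76,
        "N" => 0.71,
        "O" => 0.66,
        "S" => 1.05,
        _ => 0.77, // fallback value, an assumption for this sketch
    }
}

fn distance(a: [f32; 3], b: [f32; 3]) -> f32 {
    let d = [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
    (d[0] * d[0] + d[1] * d[1] + d[2] * d[2]).sqrt()
}

/// Returns index pairs (i, j) with i < j whose distance is within
/// the sum of covalent radii plus `tolerance`.
fn infer_bonds_sketch(atoms: &[Atom], tolerance: f32) -> Vec<(usize, usize)> {
    let mut bonds = Vec::new();
    for i in 0..atoms.len() {
        for j in (i + 1)..atoms.len() {
            let cutoff = covalent_radius(atoms[i].element)
                + covalent_radius(atoms[j].element)
                + tolerance;
            if distance(atoms[i].pos, atoms[j].pos) <= cutoff {
                bonds.push((i, j));
            }
        }
    }
    bonds
}
```

The all-pairs loop is quadratic in the number of atoms, which is fine for a single residue or ligand. For whole structures, a spatial grid or cell list is the usual way to limit the candidate pairs.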
4. Transforms
let (rotation, translation) = kabsch_alignment(&reference_ca, &target_ca);
transform_entities(&mut entities, rotation, translation);
let ca_positions = extract_ca_positions(&entities);
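Applying an alignment result amounts to a rigid-body transform of every coordinate. The sketch below shows that step on bare coordinate arrays, together with an RMSD helper for checking the fit. It assumes a row-major 3x3 rotation applied before the translation (p' = R p + t); the library's own convention and types are not stated here, so `Vec3`, `Mat3`, `apply_rigid`, and `rmsd` are hypothetical.

```rust
type Vec3 = [f32; 3];
type Mat3 = [[f32; 3]; 3];

/// Apply p' = R * p + t to every point in place.
fn apply_rigid(points: &mut [Vec3], rotation: &Mat3, translation: Vec3) {
    for p in points.iter_mut() {
        // Copy the original coordinates so each output component
        // is computed from the untransformed point.
        let [x, y, z] = *p;
        for k in 0..3 {
            p[k] = rotation[k][0] * x
                + rotation[k][1] * y
                + rotation[k][2] * z
                + translation[k];
        }
    }
}

/// Root-mean-square deviation between two equally sized point sets.
fn rmsd(a: &[Vec3], b: &[Vec3]) -> f32 {
    assert_eq!(a.len(), b.len(), "point sets must have the same length");
    let sum: f32 = a
        .iter()
        .zip(b)
        .map(|(p, q)| (0..3).map(|k| (p[k] - q[k]).powi(2)).sum::<f32>())
        .sum();
    (sum / a.len() as f32).sqrt()
}
```

Computing the RMSD between the transformed target and the reference, before and after alignment, is a quick sanity check that the rotation and translation were applied in the intended order.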
5. Serialization
For sending to C/C++/Python consumers:
// COORDS01 (flat atom array)
let bytes = serialize(&merge_entities(&entities))?;
// ASSEM01 (preserves entity types)
let bytes = serialize_assembly(&entities)?;
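To illustrate why a flat atom array is convenient for C/C++/Python consumers, here is a sketch of a simple self-describing byte layout: a fixed tag, an atom count, then little-endian f32 coordinates. This is explicitly not the COORDS01 or ASSEM01 layout, which is not specified in this section; the `SKETCH01` tag, `encode`, and `decode` are hypothetical.

```rust
/// Hypothetical 8-byte tag; NOT the library's real COORDS01 header.
const MAGIC: &[u8; 8] = b"SKETCH01";

/// Layout: 8-byte tag, u32 atom count (little-endian), then x, y, z as f32 LE per atom.
fn encode(positions: &[[f32; 3]]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + 4 + positions.len() * 12);
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(&(positions.len() as u32).to_le_bytes());
    for p in positions {
        for c in p {
            out.extend_from_slice(&c.to_le_bytes());
        }
    }
    out
}

/// Returns None if the tag is wrong or the length does not match the count.
fn decode(bytes: &[u8]) -> Option<Vec<[f32; 3]>> {
    if bytes.len() < 12 || !bytes.starts_with(MAGIC) {
        return None;
    }
    let count = u32::from_le_bytes([bytes[8], bytes[9], bytes[10], bytes[11]]) as usize;
    let body = &bytes[12..];
    if body.len() != count * 12 {
        return None;
    }
    let mut out = Vec::with_capacity(count);
    for chunk in body.chunks_exact(12) {
        let mut p = [0.0f32; 3];
        for (k, c) in chunk.chunks_exact(4).enumerate() {
            p[k] = f32::from_le_bytes([c[0], c[1], c[2], c[3]]);
        }
        out.push(p);
    }
    Some(out)
}
```

A fixed-width layout like this can be read from C with a struct cast or from Python with a single buffer view, at the cost of discarding per-entity structure. That trade-off is the reason to keep a separate format that preserves entity types.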