torchref.base.chain_closure.backbone_utils module

Backbone identification and junction placement utilities for chain closure.

Provides functions to identify backbone atoms, compute backbone torsion angles, estimate secondary structure, and plan junction placement between segments.

torchref.base.chain_closure.backbone_utils.identify_backbone_atoms(pdb)

Map (chainid, resseq) to backbone atom indices {N: idx, CA: idx, C: idx}.

Parameters: pdb (pd.DataFrame) – PDB DataFrame with columns ‘chainid’, ‘resseq’, ‘name’, ‘index’, ‘resname’.
Returns: Mapping from (chainid, resseq) to a dict of atom name -> atom index for the backbone atoms N, CA, C. Only residues with all three atoms present are included.
Return type: dict
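
The shape of the returned mapping can be illustrated with a minimal sketch. It is not the torchref implementation: the helper name backbone_map_from_rows is hypothetical, and plain dicts stand in for DataFrame rows so the example runs without pandas.

```python
from collections import defaultdict

BACKBONE_NAMES = ("N", "CA", "C")


def backbone_map_from_rows(rows):
    """Hypothetical stand-in: rows are dicts with 'chainid', 'resseq', 'name', 'index'."""
    found = defaultdict(dict)
    for row in rows:
        if row["name"] in BACKBONE_NAMES:
            found[(row["chainid"], row["resseq"])][row["name"]] = row["index"]
    # Keep only residues for which N, CA and C were all seen.
    return {key: atoms for key, atoms in found.items() if len(atoms) == 3}


rows = [
    {"chainid": "A", "resseq": 1, "name": "N", "index": 0},
    {"chainid": "A", "resseq": 1, "name": "CA", "index": 1},
    {"chainid": "A", "resseq": 1, "name": "C", "index": 2},
    {"chainid": "A", "resseq": 1, "name": "O", "index": 3},   # not a backbone key
    {"chainid": "A", "resseq": 2, "name": "N", "index": 4},
    {"chainid": "A", "resseq": 2, "name": "CA", "index": 5},  # C missing: residue dropped
]
example_backbone_map = backbone_map_from_rows(rows)
```

Residue 2 is excluded here because its C atom is absent, matching the "all three atoms present" rule stated above.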

torchref.base.chain_closure.backbone_utils.get_chain_residues(pdb)

Get ordered list of protein residue keys per chain.

Parameters: pdb (pd.DataFrame) – PDB DataFrame.
Returns: Mapping from chainid to sorted list of (chainid, resseq) tuples.
Return type: dict
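
As a rough illustration of the return shape, the sketch below groups residue keys by chain and sorts them. The helper name is hypothetical, and the step that selects protein residues from the DataFrame is omitted because the documentation does not describe it.

```python
def chain_residues_from_keys(residue_keys):
    """Hypothetical helper: group (chainid, resseq) keys by chain and sort them."""
    chains = {}
    for chainid, resseq in residue_keys:
        # A set removes duplicate keys (one residue contributes many atoms).
        chains.setdefault(chainid, set()).add((chainid, resseq))
    return {chainid: sorted(keys) for chainid, keys in chains.items()}


example_chains = chain_residues_from_keys(
    [("B", 5), ("A", 2), ("A", 1), ("B", 4), ("A", 2)]
)
```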

torchref.base.chain_closure.backbone_utils.compute_backbone_torsions(xyz, backbone_map, chain_residues)

Compute phi, psi, omega torsion angles for each residue.

Parameters:

xyz (torch.Tensor) – Atomic coordinates of shape (N, 3).
backbone_map (dict) – From identify_backbone_atoms().
chain_residues (dict) – From get_chain_residues().

Returns:

Mapping from (chainid, resseq) to {‘phi’: float, ‘psi’: float, ‘omega’: float}. Values are in radians. Missing angles are set to NaN.

Return type:

dict
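
A torsion angle is the dihedral defined by four consecutive atoms. The sketch below computes phi, taken here as C(i-1), N(i), CA(i), C(i), in pure Python and returns NaN where the previous residue is unavailable. It is an illustration under assumptions, not the library code: the function names are hypothetical, tuples stand in for rows of the (N, 3) tensor, and chain breaks are not checked.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in radians about the p1-p2 axis (atan2 form)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.atan2(_dot(_cross(b1, v), w), _dot(v, w))


def phi_angles(xyz, backbone_map, residues):
    """Hypothetical helper: phi per residue, NaN when it cannot be computed."""
    out = {}
    for i, key in enumerate(residues):
        prev = residues[i - 1] if i > 0 else None
        if key not in backbone_map or prev not in backbone_map:
            out[key] = float("nan")
            continue
        cur, before = backbone_map[key], backbone_map[prev]
        out[key] = dihedral(xyz[before["C"]], xyz[cur["N"]],
                            xyz[cur["CA"]], xyz[cur["C"]])
    return out


xyz = [(2.0, 0.0, 0.0), (1.5, 0.5, 0.0), (1.0, 0.0, 0.0),   # N, CA, C of residue 1
       (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]   # N, CA, C of residue 2
bb = {("A", 1): {"N": 0, "CA": 1, "C": 2}, ("A", 2): {"N": 3, "CA": 4, "C": 5}}
phi = phi_angles(xyz, bb, [("A", 1), ("A", 2)])
```

The first residue has no preceding C atom, so its phi is NaN. The psi and omega angles would use the same dihedral function with the atoms of the following residue.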

torchref.base.chain_closure.backbone_utils.estimate_secondary_structure(torsions)

Simple Ramachandran region classification: H (helix), E (sheet), L (loop).

Parameters: torsions (dict) – From compute_backbone_torsions().
Returns: Mapping from (chainid, resseq) to ‘H’, ‘E’, or ‘L’.
Return type: dict
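
A Ramachandran-region classifier of this kind can be sketched as follows. The region boundaries are illustrative assumptions chosen for the example, not the thresholds torchref uses, and the function names are hypothetical. Residues with a NaN angle are assigned ‘L’ in this sketch.

```python
import math


def classify_residue(phi, psi):
    """Classify one residue from phi/psi in radians; boundaries are illustrative."""
    if math.isnan(phi) or math.isnan(psi):
        return "L"
    phi_d, psi_d = math.degrees(phi), math.degrees(psi)
    if -160 <= phi_d <= -20 and -120 <= psi_d <= 50:
        return "H"  # broad alpha-helical basin
    if -180 <= phi_d <= -40 and (psi_d >= 90 or psi_d <= -150):
        return "E"  # broad beta-sheet basin
    return "L"


def classify_all(torsions):
    """Apply classify_residue to a {(chainid, resseq): {'phi', 'psi', ...}} mapping."""
    return {key: classify_residue(t["phi"], t["psi"]) for key, t in torsions.items()}


example_ss = classify_all({
    ("A", 1): {"phi": float("nan"), "psi": math.radians(140)},
    ("A", 2): {"phi": math.radians(-60), "psi": math.radians(-45)},
    ("A", 3): {"phi": math.radians(-120), "psi": math.radians(130)},
    ("A", 4): {"phi": math.radians(60), "psi": math.radians(60)},
})
```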

torchref.base.chain_closure.backbone_utils.plan_junction_placement(chain_residues, backbone_map, n_aa_per_segment=18, junction_size=3, ss=None, prefer_loops=True)

Plan segment and junction placement along protein chains.

Divides each chain into segments of ~n_aa_per_segment residues with junction_size-residue junctions between them. Optionally slides junctions to prefer loop regions.

The algorithm:

1. Determine nominal junction positions at every n_aa_per_segment residues.
2. Optionally slide each junction within ±slide_range to prefer loops.
3. Build segments from the non-junction gaps between junctions.

Parameters:

chain_residues (dict) – From get_chain_residues().
backbone_map (dict) – From identify_backbone_atoms().
n_aa_per_segment (int) – Target number of residues per free-DOF segment.
junction_size (int) – Number of residues per junction (slave DOFs).
ss (dict, optional) – Secondary structure assignments from estimate_secondary_structure().
prefer_loops (bool) – If True and ss is provided, slide junctions to prefer loop regions.

Returns:

segments (list of list) – Each inner list contains (chainid, resseq) keys for one segment.
junctions (list of list) – Each inner list contains (chainid, resseq) keys for one junction. Junction i connects segment i to segment i+1.

Return type:

Tuple[List[List[Tuple[str, int]]], List[List[Tuple[str, int]]]]
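
The three steps of the algorithm can be sketched for a single chain as below. This is one way to realise the description, not the torchref implementation: the function name is hypothetical, slide_range is exposed as a keyword with an assumed default of 3 (it is not part of the documented signature), the scoring rule (most loop residues, then closest to the nominal position) is an assumption, and n_aa_per_segment and junction_size are assumed to be at least 1.

```python
def plan_chain_junctions(residues, n_aa_per_segment=18, junction_size=3,
                         ss=None, prefer_loops=True, slide_range=3):
    """Hypothetical single-chain planner; residues is a sorted list of keys."""
    n = len(residues)
    starts = []
    prev_end = 0             # index just past the previous junction
    pos = n_aa_per_segment   # step 1: nominal start index of the next junction
    # Require at least one residue after each junction.
    while pos + junction_size < n:
        start = pos
        if ss is not None and prefer_loops:
            # Step 2: slide within +/- slide_range, keeping segments non-empty.
            lo = max(prev_end + 1, pos - slide_range)
            hi = min(n - junction_size - 1, pos + slide_range)

            def score(s, nominal=pos):
                loops = sum(ss.get(residues[k]) == "L"
                            for k in range(s, s + junction_size))
                return (loops, -abs(s - nominal))

            start = max(range(lo, hi + 1), key=score)
        starts.append(start)
        prev_end = start + junction_size
        pos = prev_end + n_aa_per_segment

    # Step 3: segments are the gaps between consecutive junctions.
    segments, junctions = [], []
    cursor = 0
    for s in starts:
        segments.append(residues[cursor:s])
        junctions.append(residues[s:s + junction_size])
        cursor = s + junction_size
    segments.append(residues[cursor:])
    return segments, junctions


chain = [("A", i) for i in range(1, 41)]
segs, juncs = plan_chain_junctions(chain)

# Mark residues 21-23 as loop; the junction slides forward onto them.
ss_example = {key: "H" for key in chain}
for resseq in (21, 22, 23):
    ss_example[("A", resseq)] = "L"
segs_ss, juncs_ss = plan_chain_junctions(chain, ss=ss_example)
```

For the 40-residue chain, the plan without secondary structure has one junction at residues 19 to 21, between segments of 18 and 19 residues. With the loop annotation the junction moves to residues 21 to 23, and junction i still sits between segment i and segment i+1.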

torchref.base.chain_closure.backbone_utils.get_junction_backbone_indices(junction_residues, backbone_map)

Get ordered backbone atom indices for junction residues.

Parameters:

junction_residues (list) – List of (chainid, resseq) tuples for the junction.
backbone_map (dict) – From identify_backbone_atoms().

Returns:

List of dicts with ‘N’, ‘CA’, ‘C’ atom indices, one per residue.

Return type:

list

Raises:

ValueError – If any junction residue lacks backbone atoms.
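
The lookup and its error case can be sketched as follows. The helper name and the error message are hypothetical; only the return shape and the ValueError behaviour come from the description above.

```python
def junction_indices(junction_residues, backbone_map):
    """Hypothetical helper: ordered N/CA/C index dicts, one per junction residue."""
    ordered = []
    for key in junction_residues:
        atoms = backbone_map.get(key)
        if atoms is None or any(name not in atoms for name in ("N", "CA", "C")):
            raise ValueError(f"junction residue {key} lacks backbone atoms")
        ordered.append({name: atoms[name] for name in ("N", "CA", "C")})
    return ordered


bb_map = {("A", 19): {"N": 90, "CA": 91, "C": 92},
          ("A", 20): {"N": 98, "CA": 99, "C": 100}}
junction_atoms = junction_indices([("A", 19), ("A", 20)], bb_map)

try:
    junction_indices([("A", 20), ("A", 21)], bb_map)  # residue 21 is not mapped
    missing_raises = False
except ValueError:
    missing_raises = True
```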