torchref.base.chain_closure.backbone_utils module

Backbone identification and junction placement utilities for chain closure.

Provides functions to identify backbone atoms, compute backbone torsion angles, estimate secondary structure, and plan junction placement between segments.

torchref.base.chain_closure.backbone_utils.identify_backbone_atoms(pdb)

Map (chainid, resseq) to backbone atom indices {N: idx, CA: idx, C: idx}.

Parameters:

pdb (pd.DataFrame) – PDB DataFrame with columns ‘chainid’, ‘resseq’, ‘name’, ‘index’, ‘resname’.

Returns:

Mapping from (chainid, resseq) to dict of atom name -> atom index for backbone atoms N, CA, C. Only residues with all three atoms present are included.

Return type:

dict
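The mapping described above can be sketched in plain Python. This is a minimal illustration, not the torchref implementation: the helper name sketch_backbone_map is hypothetical, and it takes rows as plain dicts so that it runs without pandas (for a DataFrame, pdb.to_dict("records") yields rows of this shape).

```python
BACKBONE_NAMES = ("N", "CA", "C")


def sketch_backbone_map(rows):
    """Hypothetical stand-in for identify_backbone_atoms (illustration only)."""
    partial = {}
    for row in rows:
        name = row["name"].strip()
        if name in BACKBONE_NAMES:
            key = (row["chainid"], row["resseq"])
            partial.setdefault(key, {})[name] = row["index"]
    # Keep only residues where N, CA and C were all found.
    return {k: v for k, v in partial.items() if len(v) == len(BACKBONE_NAMES)}


# Residue 1 is complete; residue 2 is missing its C atom and is dropped.
rows = [
    {"chainid": "A", "resseq": 1, "name": "N", "index": 0, "resname": "GLY"},
    {"chainid": "A", "resseq": 1, "name": "CA", "index": 1, "resname": "GLY"},
    {"chainid": "A", "resseq": 1, "name": "C", "index": 2, "resname": "GLY"},
    {"chainid": "A", "resseq": 1, "name": "O", "index": 3, "resname": "GLY"},
    {"chainid": "A", "resseq": 2, "name": "N", "index": 4, "resname": "ALA"},
    {"chainid": "A", "resseq": 2, "name": "CA", "index": 5, "resname": "ALA"},
]
backbone_map = sketch_backbone_map(rows)
print(backbone_map)
```

Dropping incomplete residues at this stage means downstream code can index N, CA and C without further checks.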

torchref.base.chain_closure.backbone_utils.get_chain_residues(pdb)

Get ordered list of protein residue keys per chain.

Parameters:

pdb (pd.DataFrame) – PDB DataFrame.

Returns:

Mapping from chainid to sorted list of (chainid, resseq) tuples.

Return type:

dict
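The shape of the returned mapping can be illustrated as follows. How torchref decides which residues count as protein is not stated here, so this sketch assumes the (chainid, resseq) keys have already been selected; the helper name sketch_chain_residues is hypothetical.

```python
from collections import defaultdict


def sketch_chain_residues(residue_keys):
    """Group (chainid, resseq) keys by chain and sort each group (illustration only)."""
    chains = defaultdict(list)
    # set() removes duplicate keys, e.g. one key per atom row of the same residue.
    for chainid, resseq in set(residue_keys):
        chains[chainid].append((chainid, resseq))
    return {chainid: sorted(keys) for chainid, keys in chains.items()}


chain_residues = sketch_chain_residues(
    [("B", 2), ("A", 10), ("A", 9), ("B", 1), ("A", 10)]
)
print(chain_residues)
```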

torchref.base.chain_closure.backbone_utils.compute_backbone_torsions(xyz, backbone_map, chain_residues)

Compute phi, psi, omega torsion angles for each residue.

Parameters:
  • xyz (torch.Tensor) – Atomic coordinates of shape (N, 3).

  • backbone_map (dict) – From identify_backbone_atoms().

  • chain_residues (dict) – From get_chain_residues().

Returns:

Mapping from (chainid, resseq) to {‘phi’: float, ‘psi’: float, ‘omega’: float}. Values are in radians. Missing angles are set to NaN.

Return type:

dict
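Each torsion is a dihedral angle over four consecutive backbone atoms. In the standard definitions, phi of residue i uses C(i-1), N(i), CA(i), C(i) and psi uses N(i), CA(i), C(i), N(i+1), which is why angles at chain ends are missing. The sketch below computes one dihedral in radians with the atan2 formulation, using plain tuples instead of a torch.Tensor. The function name dihedral and the sign convention (cis is 0, trans is ±pi) are assumptions of this sketch, not confirmed details of the library.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.atan2(y, x)


# Three test geometries sharing the same central bond along z.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(math.degrees(cis), math.degrees(quarter), math.degrees(trans))
```

Using atan2 rather than acos keeps the sign of the angle and stays numerically stable near 0 and pi.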

torchref.base.chain_closure.backbone_utils.estimate_secondary_structure(torsions)

Classify each residue by its Ramachandran region: H (helix), E (sheet), or L (loop).

Parameters:

torsions (dict) – From compute_backbone_torsions().

Returns:

Mapping from (chainid, resseq) to ‘H’, ‘E’, or ‘L’.

Return type:

dict
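A Ramachandran-region classifier of this kind can be sketched as below. The region boundaries are illustrative assumptions chosen for this example; the thresholds torchref actually uses are not given here. The helper names classify_residue and sketch_secondary_structure are hypothetical, and residues with a NaN angle are assumed to fall back to 'L'.

```python
import math


def classify_residue(phi, psi):
    """Return 'H', 'E' or 'L' from phi/psi in radians (assumed thresholds)."""
    if math.isnan(phi) or math.isnan(psi):
        return "L"  # assumption: undefined angles are treated as loop
    phi_deg, psi_deg = math.degrees(phi), math.degrees(psi)
    # Assumed right-handed helix region.
    if -160.0 <= phi_deg <= -20.0 and -120.0 <= psi_deg <= 50.0:
        return "H"
    # Assumed beta region; psi wraps around at +/-180 degrees.
    if -180.0 <= phi_deg <= -40.0 and (psi_deg >= 90.0 or psi_deg <= -150.0):
        return "E"
    return "L"


def sketch_secondary_structure(torsions):
    """Apply classify_residue to a torsion mapping (illustration only)."""
    return {key: classify_residue(t["phi"], t["psi"]) for key, t in torsions.items()}


torsions = {
    ("A", 1): {"phi": float("nan"), "psi": math.radians(140), "omega": math.pi},
    ("A", 2): {"phi": math.radians(-60), "psi": math.radians(-45), "omega": math.pi},
    ("A", 3): {"phi": math.radians(-120), "psi": math.radians(130), "omega": math.pi},
    ("A", 4): {"phi": math.radians(60), "psi": math.radians(60), "omega": math.pi},
}
ss = sketch_secondary_structure(torsions)
print(ss)
```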

torchref.base.chain_closure.backbone_utils.plan_junction_placement(chain_residues, backbone_map, n_aa_per_segment=18, junction_size=3, ss=None, prefer_loops=True)

Plan segment and junction placement along protein chains.

Divides each chain into segments of ~n_aa_per_segment residues with junction_size-residue junctions between them. Optionally slides junctions to prefer loop regions.

The algorithm:
  1. Determine nominal junction positions at every n_aa_per_segment residues.
  2. Optionally slide each junction within ±slide_range to prefer loops.
  3. Build segments from the non-junction gaps between junctions.

Parameters:
  • chain_residues (dict) – From get_chain_residues().

  • backbone_map (dict) – From identify_backbone_atoms().

  • n_aa_per_segment (int) – Target number of residues per free-DOF segment.

  • junction_size (int) – Number of residues per junction (slave DOFs).

  • ss (dict, optional) – Secondary structure assignments from estimate_secondary_structure().

  • prefer_loops (bool) – If True and ss is provided, slide junctions to prefer loop regions.

Returns:

  • segments (list of list) – Each inner list contains (chainid, resseq) keys for one segment.

  • junctions (list of list) – Each inner list contains (chainid, resseq) keys for one junction. Junction i connects segment i to segment i+1.

Return type:

Tuple[List[List[Tuple[str, int]]], List[List[Tuple[str, int]]]]
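The three steps of the algorithm can be sketched for a single chain as follows. This is an illustration under stated assumptions, not the library code: plan_chain is a hypothetical helper, slide_range is exposed as an explicit argument with an assumed default of 3 (it is not part of the documented signature), and "prefer loops" is taken to mean choosing the window with the most 'L' residues, breaking ties toward the nominal position.

```python
def plan_chain(residues, n_aa_per_segment=18, junction_size=3, ss=None, slide_range=3):
    """Sketch of the three steps for one chain (illustration only).

    slide_range is an assumed parameter; keep it smaller than n_aa_per_segment.
    """
    n = len(residues)

    # Step 1: nominal junction start indices, leaving residues after the last one.
    starts = []
    pos = n_aa_per_segment
    while pos + junction_size < n:
        starts.append(pos)
        pos += n_aa_per_segment + junction_size

    # Step 2: slide each junction to the nearby window with the most loop residues.
    if ss is not None:
        def loop_count(start):
            window = residues[start:start + junction_size]
            return sum(ss.get(key) == "L" for key in window)

        slid, prev_end = [], 0
        for nominal in starts:
            candidates = [
                s for s in range(nominal - slide_range, nominal + slide_range + 1)
                if s > prev_end and s + junction_size < n
            ]
            best = max(
                candidates,
                key=lambda s: (loop_count(s), -abs(s - nominal)),
                default=nominal,
            )
            slid.append(best)
            prev_end = best + junction_size
        starts = slid

    # Step 3: segments are the gaps before, between, and after the junctions.
    segments, junctions, cursor = [], [], 0
    for start in starts:
        segments.append(residues[cursor:start])
        junctions.append(residues[start:start + junction_size])
        cursor = start + junction_size
    segments.append(residues[cursor:])
    return segments, junctions


residues = [("A", i) for i in range(1, 46)]  # 45 residues, resseq 1..45

# Without secondary structure, junctions stay at their nominal positions.
segments, junctions = plan_chain(residues)

# With a short loop at resseq 21-23, the first junction slides onto it.
ss = {key: "H" for key in residues}
for resseq in (21, 22, 23):
    ss[("A", resseq)] = "L"
segments_ss, junctions_ss = plan_chain(residues, ss=ss)
print([len(s) for s in segments], [len(s) for s in segments_ss])
```

In this sketch every chain yields one more segment than junctions, which matches the documented convention that junction i connects segment i to segment i+1.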

torchref.base.chain_closure.backbone_utils.get_junction_backbone_indices(junction_residues, backbone_map)

Get ordered backbone atom indices for junction residues.

Parameters:
  • junction_residues (list) – List of (chainid, resseq) tuples for the junction.

  • backbone_map (dict) – From identify_backbone_atoms().

Returns:

List of dicts with ‘N’, ‘CA’, ‘C’ atom indices, one per residue.

Return type:

list

Raises:

ValueError – If any junction residue lacks backbone atoms.
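The lookup and its error behaviour can be sketched as below. The helper name sketch_junction_indices and the error message wording are hypothetical; only the return shape and the ValueError condition come from the description above.

```python
def sketch_junction_indices(junction_residues, backbone_map):
    """Collect N, CA, C indices per junction residue, in order (illustration only)."""
    required = ("N", "CA", "C")
    indices = []
    for key in junction_residues:
        atoms = backbone_map.get(key)
        if atoms is None or any(name not in atoms for name in required):
            raise ValueError(f"Junction residue {key} lacks backbone atoms")
        indices.append({name: atoms[name] for name in required})
    return indices


backbone_map = {
    ("A", 19): {"N": 140, "CA": 141, "C": 142},
    ("A", 20): {"N": 148, "CA": 149, "C": 150},
}
junction = [("A", 19), ("A", 20)]
junction_indices = sketch_junction_indices(junction, backbone_map)
print(junction_indices)
```

Failing loudly on a missing residue is useful here because a junction with an incomplete backbone cannot be closed.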