torchref.model.segmented_internal_coordinates module
Segmented internal coordinate parametrization for atomic structures.
This module provides the SegmentedInternalCoordinateTensor class which addresses the “lever arm problem” in internal coordinate parametrization by breaking the molecular chain into independent segments, each with its own rigid body parameters.
Key features:
- Segments the molecule into groups of N amino acids (default: 3 per segment)
- Each segment has independent internal coordinates (bonds, angles, torsions)
- Each segment has rigid body parameters (position + orientation)
- Shallow spanning trees within segments (depth ~15-30 instead of ~1000)
- Changes in one segment don’t propagate to distant segments
- Fully differentiable reconstruction from internal coordinates
- Parallelized construction for fast initialization
- Fused ring systems (indole in TRP, purines, etc.) treated as single rigid groups
This approach solves the lever arm problem, in which small torsion changes near the root of a deep tree cause large displacements at distant atoms.
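The size of the lever arm effect follows from simple geometry: a point at perpendicular distance r from a rotation axis moves by the chord length 2·r·sin(Δφ/2) when the torsion changes by Δφ. The sketch below only illustrates that relationship in plain Python. The function name and the example distances are assumptions chosen for illustration and are not part of torchref.

```python
import math


def torsion_displacement(radius, delta_phi):
    """Chord length travelled by a point at perpendicular distance `radius`
    (Angstroms) from the rotation axis when rotated by `delta_phi` radians."""
    return 2.0 * radius * math.sin(abs(delta_phi) / 2.0)


delta = math.radians(1.0)  # a 1 degree torsion change

# Hypothetical distances: an atom inside a short segment vs. one far down a deep tree.
for radius in (5.0, 50.0):
    shift = torsion_displacement(radius, delta)
    print(f"r = {radius:5.1f} A -> shift = {shift:.3f} A")
```

For small angles the shift grows linearly with distance from the axis, which is why limiting tree depth per segment limits how far a single torsion change can move an atom.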
- class torchref.model.segmented_internal_coordinates.SegmentedInternalCoordinateTensor(initial_xyz, pdb, n_aa_per_segment=3, bond_cutoff=2.0, cif_dict=None, requires_grad=True, dtype=None, device=None)[source]
Bases:
DeviceMixin, CachedForwardMixin, Module
Parameter wrapper using segmented internal coordinates.
Stores: per-segment bond_lengths, angles, torsions, segment_positions, segment_orientations
Reconstructs: Cartesian xyz on forward()
This provides a physically meaningful parametrization that avoids the lever arm problem by breaking the molecule into independent segments, each with shallow spanning trees and rigid body parameters.
- Parameters:
initial_xyz (torch.Tensor) – Initial Cartesian coordinates of shape (N, 3).
pdb (pd.DataFrame) – PDB DataFrame with columns ‘chainid’, ‘resseq’, ‘name’, ‘index’, ‘resname’.
n_aa_per_segment (int, optional) – Number of amino acids per segment. Default is 3.
bond_cutoff (float, optional) – Distance cutoff for bond detection in Angstroms. Default is 2.0.
requires_grad (bool, optional) – Whether parameters should have gradients. Default is True.
dtype (torch.dtype, optional) – Data type for tensors. Default is same as initial_xyz.
device (torch.device, optional) – Device for tensors. Default is same as initial_xyz.
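To make the role of n_aa_per_segment concrete, the following is a minimal sketch of grouping consecutive residues into segments. It is not the torchref implementation: the function name is hypothetical, residues are represented as plain (chainid, resseq) tuples, and the choice to start a new segment at every chain boundary is an assumption.

```python
def segment_residues(residues, n_aa_per_segment=3):
    """Group an ordered list of (chainid, resseq) tuples into segments.

    Assumption for this sketch: a segment never spans two chains.
    """
    segments = []
    current = []
    for res in residues:
        chain_changed = bool(current) and current[-1][0] != res[0]
        if chain_changed or len(current) == n_aa_per_segment:
            segments.append(current)
            current = []
        current.append(res)
    if current:
        segments.append(current)  # last, possibly shorter, segment
    return segments


# Seven residues in chain A followed by two in chain B.
residues = [("A", i) for i in range(1, 8)] + [("B", 1), ("B", 2)]
for seg in segment_residues(residues):
    print(seg)
```

In this sketch the trailing residue of chain A forms its own shorter segment rather than being merged with chain B.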
- bond_lengths
Bond length parameters in Angstroms.
- Type:
nn.Parameter
- angles
Angle parameters in radians.
- Type:
nn.Parameter
- torsions
Torsion angle parameters in radians.
- Type:
nn.Parameter
- segment_positions
Absolute positions of segment root atoms.
- Type:
nn.Parameter
- segment_orientations
ZYZ Euler angle orientations for each segment.
- Type:
nn.Parameter
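For readers unfamiliar with ZYZ Euler angles, the sketch below builds a rotation matrix as Rz(alpha)·Ry(beta)·Rz(gamma) in plain Python. The composition order, the angle names, and the helper functions are assumptions made for illustration. The convention actually used by segment_orientations is not stated here and may differ.

```python
import math


def rot_z(a):
    """Rotation by angle `a` (radians) about the z axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(a):
    """Rotation by angle `a` (radians) about the y axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def zyz_to_matrix(alpha, beta, gamma):
    """Assumed convention: R = Rz(alpha) @ Ry(beta) @ Rz(gamma)."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))


def rotate(rot, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(rot[i][k] * v[k] for k in range(3)) for i in range(3)]


# With alpha = gamma = 0 and beta = 90 degrees, the z axis maps onto the x axis.
print(rotate(zyz_to_matrix(0.0, math.pi / 2, 0.0), [0.0, 0.0, 1.0]))
```

Three angles per segment plus a three-component position give the six rigid body degrees of freedom described above.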
- AA_NAMES = frozenset({'ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'ILE', 'LEU', 'LYS', 'MET', 'MSE', 'PHE', 'PRO', 'SEC', 'SER', 'THR', 'TRP', 'TYR', 'VAL'})
- __init__(initial_xyz, pdb, n_aa_per_segment=3, bond_cutoff=2.0, cif_dict=None, requires_grad=True, dtype=None, device=None)[source]
Initialize SegmentedInternalCoordinateTensor.
- Parameters:
initial_xyz (torch.Tensor) – Initial Cartesian coordinates of shape (N, 3).
pdb (pd.DataFrame) – PDB DataFrame with columns ‘chainid’, ‘resseq’, ‘name’, ‘index’, ‘resname’.
n_aa_per_segment (int, optional) – Number of amino acids per segment. Default is 3.
bond_cutoff (float, optional) – Distance cutoff for bond detection in Angstroms (used as fallback). Default is 2.0.
cif_dict (dict, optional) – CIF dictionary containing bond definitions per residue type. If provided, bonds are determined from chemical definitions rather than distances, which is more robust for structures with poor geometry. Expected format: cif_dict[resname][‘bonds’] is a DataFrame with ‘atom1’ and ‘atom2’ columns.
requires_grad (bool, optional) – Whether parameters should have gradients. Default is True.
dtype (torch.dtype, optional) – Data type for tensors. Default is same as initial_xyz.
device (torch.device, optional) – Device for tensors. Default is same as initial_xyz.
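The distance-based fallback controlled by bond_cutoff can be pictured as a pairwise distance test. The sketch below is a plain-Python illustration, not the torchref code: the function name is hypothetical, the comparison is assumed to be inclusive, and the brute-force loop over all pairs is chosen for clarity rather than speed.

```python
import math
from itertools import combinations


def detect_bonds(xyz, bond_cutoff=2.0):
    """Return index pairs (i, j), i < j, whose distance is within `bond_cutoff`.

    `xyz` is a list of [x, y, z] coordinates in Angstroms.
    """
    bonds = []
    for i, j in combinations(range(len(xyz)), 2):
        if math.dist(xyz[i], xyz[j]) <= bond_cutoff:
            bonds.append((i, j))
    return bonds


# Approximate backbone fragment N, CA, C (illustrative coordinates).
xyz = [[0.0, 0.0, 0.0], [1.46, 0.0, 0.0], [2.0, 1.42, 0.0]]
print(detect_bonds(xyz))
```

A purely geometric test like this can miss or invent bonds when the input geometry is distorted, which is the situation where chemically defined bonds from cif_dict are described as more robust.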
- property dtype
Return the dtype of tensors.
- property device
Return the device of tensors.
- forward()[source]
Reconstruct Cartesian xyz from internal coordinates.
Uses fully vectorized operations for maximum performance.
- Returns:
Reconstructed Cartesian coordinates of shape (N, 3).
- Return type:
torch.Tensor
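Reconstruction from internal coordinates places each atom from three already placed atoms plus a bond length, a bond angle, and a torsion. The sketch below shows one common construction (often called NeRF, natural extension reference frame) for a single atom in plain Python. It is an illustration under assumptions: the helper names are hypothetical, the torsion sign convention may differ from torchref, and the library is described as using vectorized tensor operations rather than a per-atom function like this.

```python
import math


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def place_atom(a, b, c, bond_length, angle, torsion):
    """Place atom D bonded to C, given reference atoms A-B-C.

    bond_length: |C-D| in Angstroms
    angle:       B-C-D angle in radians
    torsion:     A-B-C-D dihedral in radians (0 = cis, pi = trans)
    """
    bc = normalize(sub(c, b))            # unit vector along B -> C
    n = normalize(cross(sub(b, a), bc))  # normal of the A-B-C plane
    m = cross(n, bc)                     # completes the local frame
    d = (-bond_length * math.cos(angle),
         bond_length * math.sin(angle) * math.cos(torsion),
         bond_length * math.sin(angle) * math.sin(torsion))
    return [c[i] + d[0] * bc[i] + d[1] * m[i] + d[2] * n[i] for i in range(3)]


a, b, c = [0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
print(place_atom(a, b, c, 1.5, math.pi / 2, math.pi))  # trans placement
```

Because every step is built from differentiable arithmetic, the same construction expressed with tensors lets gradients flow from Cartesian coordinates back to bond lengths, angles, and torsions.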
- shake(magnitude=0.1)[source]
Add Gaussian noise to internal parameters.
- Parameters:
magnitude (float, optional) – Standard deviation of Gaussian noise. Default is 0.1.
- Returns:
New Cartesian coordinates after perturbation.
- Return type:
torch.Tensor
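As a rough picture of what a perturbation with standard deviation magnitude does, the sketch below adds independent Gaussian noise to a list of parameter values using only the standard library. The function name and the optional rng argument are hypothetical. How torchref distributes the noise across bond lengths, angles, torsions, and rigid body parameters is not specified here.

```python
import random


def shake_parameters(values, magnitude=0.1, rng=None):
    """Return a copy of `values` with zero-mean Gaussian noise added.

    `magnitude` is the standard deviation of the noise; `rng` is an
    optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, magnitude) for v in values]


torsions = [-1.05, 2.36, 3.14]  # illustrative torsion values in radians
print(shake_parameters(torsions, magnitude=0.1, rng=random.Random(0)))
```

Note that the same numeric magnitude means different things for parameters in Angstroms and in radians, so a single value perturbs lengths and angles on different physical scales.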
- fix(selection=None, freeze_at_current=True)[source]
Fix (freeze) atoms to use fixed xyz coordinates.
- Parameters:
selection (torch.Tensor, slice, or None) – Boolean mask or indices of atoms to fix.
freeze_at_current (bool, optional) – If True, store current coordinates for selected atoms. Default is True.
- refine(selection=None, rebuild=True)[source]
Make atoms refinable.
- Parameters:
selection (torch.Tensor, slice, or None) – Boolean mask or indices of atoms to make refinable.
rebuild (bool, optional) – If True, rebuild internal coordinates from fixed_xyz. Default is True.
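One way to picture the interplay of fix() and refine() is a per-atom boolean mask that selects between stored fixed coordinates and coordinates reconstructed from internal parameters. The sketch below illustrates that idea in plain Python. The function and argument names are hypothetical, and the actual bookkeeping inside torchref may be organized differently.

```python
def merge_coordinates(reconstructed, fixed_xyz, fixed_mask):
    """Pick the fixed coordinate where the mask is True, else the reconstructed one."""
    return [list(f) if is_fixed else list(r)
            for r, f, is_fixed in zip(reconstructed, fixed_xyz, fixed_mask)]


reconstructed = [[0.1, 0.0, 0.0], [1.6, 0.1, 0.0], [2.1, 1.5, 0.0]]
fixed_xyz = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0]]
fixed_mask = [True, False, True]  # atoms 0 and 2 are frozen

print(merge_coordinates(reconstructed, fixed_xyz, fixed_mask))
```

In this picture, freezing at the current coordinates corresponds to copying the reconstructed values into the fixed array before setting the mask, and making atoms refinable corresponds to clearing their mask entries.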