torchref.refinement.targets.amber_target module

AMBER14/GAFF2 Force Field as a Differentiable Restraint.

Uses OpenMM to evaluate the AMBER14 energy for the current model coordinates. Analytical forces from OpenMM are bridged into PyTorch autograd via a custom torch.autograd.Function, making the energy fully differentiable w.r.t. xyz.
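The bridging idea can be illustrated with a minimal sketch. It assumes a toy harmonic energy as a stand-in for OpenMM; `toy_energy_and_forces` and `ExternalEnergy` are hypothetical names and are not part of torchref. The backward pass returns the negative forces, because the gradient of the energy is minus the force.

```python
import torch


def toy_energy_and_forces(xyz):
    """Stand-in for an external engine: harmonic energy and analytical forces."""
    k = 2.0
    energy = 0.5 * k * float((xyz ** 2).sum())
    forces = -k * xyz  # force = -dE/dx
    return energy, forces


class ExternalEnergy(torch.autograd.Function):
    """Expose an externally computed energy to autograd through its forces."""

    @staticmethod
    def forward(ctx, xyz):
        energy, forces = toy_energy_and_forces(xyz.detach())
        ctx.save_for_backward(forces)
        return xyz.new_tensor(energy)

    @staticmethod
    def backward(ctx, grad_output):
        (forces,) = ctx.saved_tensors
        # dE/dxyz = -forces, scaled by the upstream gradient
        return -forces * grad_output


xyz = torch.tensor([[1.0, 0.0, -2.0], [0.5, 0.5, 0.0]], requires_grad=True)
energy = ExternalEnergy.apply(xyz)
energy.backward()  # xyz.grad now holds -forces
```

In a real bridge the toy function would be replaced by a call into the external engine, with any unit conversion between the engine and the model applied before returning the gradient.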

Non-standard residues (HETATM not in AMBER14_STANDARD) are parameterised automatically via antechamber/GAFF2. Results are cached under PATH_TORCHREF_DATA / "amber_cache" / {resname}/.
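The cache layout described above could be resolved as in the following sketch. It assumes PATH_TORCHREF_DATA behaves like a pathlib.Path; a temporary directory stands in for it here, and `cache_dir_for` is a hypothetical helper, not a torchref function.

```python
import tempfile
from pathlib import Path

# Stand-in for PATH_TORCHREF_DATA (assumed to be a pathlib.Path).
data_root = Path(tempfile.mkdtemp())


def cache_dir_for(resname, root=data_root):
    """Return the per-residue cache directory, creating it if needed."""
    path = root / "amber_cache" / resname
    path.mkdir(parents=True, exist_ok=True)
    return path


lig_dir = cache_dir_for("LIG")
```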

Intended workflow:

# Canonical setup: strip altlocs, add H, then build the target:
mh = (Model(verbose=0, strip_H=True)
      .load_pdb('structure.pdb')
      .strip_altlocs()
      .generate_hydrogens())
target = AmberTarget(model=mh)                               # protein-only
target = AmberTarget(model=mh, residue_charges={'LIG': -1})  # with ligand

loss = target()          # kJ/mol per atom
loss.backward()
# xyz gradient is now populated with AMBER forces

Performance note

OpenMM’s Modeller.addHydrogens() is 4–5× faster when H atoms are already present in the model (it refines positions rather than building from scratch). For pure-protein structures, initialisation takes ~3 s with H present and ~11 s from heavy atoms only. The unnormalised energy and the heavy-atom gradient are the same either way: model H atoms are excluded from the atom map, so the larger n_model_atoms changes only the per-atom energy normalisation.
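The effect of n_model_atoms on the reported value can be shown with a small worked example. The energy and atom counts below are illustrative numbers only, and `normalised_energy` is a hypothetical helper.

```python
def normalised_energy(total_kj_mol, n_model_atoms):
    """Per-atom energy: the total energy divided by the model atom count."""
    return total_kj_mol / n_model_atoms


total = -1200.0  # illustrative total energy in kJ/mol
heavy_only = normalised_energy(total, 100)  # model without H atoms
with_h = normalised_energy(total, 200)      # same structure, H atoms counted
```

The total is unchanged; only the divisor differs, so per-atom values from models with and without H are not directly comparable.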

Design notes

  • No pdbfixer dependency. H atoms are handled by OpenMM’s Modeller (standard-residues path) or tleap (GAFF2 path).

  • Altloc atoms are filtered before building the OpenMM system: only the primary conformation (altloc == '' or 'A') is used.

  • OXT and H atoms are excluded from the PDB written to tleap; tleap re-adds them via its C-terminal and H-addition templates.

  • H positions in the OpenMM context are set once at construction and are NOT updated during forward(). This is a good approximation for small refinement steps (< 0.1 Å heavy-atom displacement).

  • model_to_omm maps model-atom index → OpenMM atom index for HEAVY atoms only. Model H atoms receive -1 and are skipped in forward().
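The altloc filter and the heavy-atom index map from the notes above can be sketched together. This is an illustration under stated assumptions: atoms are matched by (residue id, atom name), non-primary altloc atoms receive -1 like H atoms, and the `Atom` tuple and `build_model_to_omm` are hypothetical names rather than torchref internals.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    resid: int
    name: str
    element: str
    altloc: str = ""


def is_primary_heavy(atom):
    """True for non-H atoms in the primary conformation ('' or 'A')."""
    return atom.altloc in ("", "A") and atom.element != "H"


def build_model_to_omm(model_atoms, omm_atoms):
    """Map each model atom index to an OpenMM atom index, or -1 if skipped."""
    omm_index = {(a.resid, a.name): i for i, a in enumerate(omm_atoms)}
    return [
        omm_index.get((a.resid, a.name), -1) if is_primary_heavy(a) else -1
        for a in model_atoms
    ]


model_atoms = [
    Atom(1, "N", "N"),
    Atom(1, "CA", "C", "A"),
    Atom(1, "CA", "C", "B"),  # secondary conformation, skipped
    Atom(1, "HA", "H"),       # model H atom, skipped
]
# OpenMM atom order after H atoms have been re-added.
omm_atoms = [Atom(1, "N", "N"), Atom(1, "H", "H"), Atom(1, "CA", "C"), Atom(1, "HA", "H")]

model_to_omm = build_model_to_omm(model_atoms, omm_atoms)
```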

torchref.refinement.targets.amber_target.AMBER14_STANDARD: frozenset = frozenset({'A', 'ACE', 'ALA', 'ARG', 'ASN', 'ASP', 'C', 'CA', 'CL', 'CYS', 'CYX', 'DA', 'DC', 'DG', 'DT', 'FE', 'G', 'GLN', 'GLU', 'GLY', 'HID', 'HIE', 'HIP', 'HIS', 'HOH', 'ILE', 'K', 'LEU', 'LYS', 'MET', 'MG', 'MN', 'NA', 'NME', 'PHE', 'PRO', 'SER', 'T', 'THR', 'TRP', 'TYR', 'U', 'VAL', 'WAT', 'ZN'})

Residue names covered by AMBER14 force field — antechamber not needed.
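A minimal sketch of how this set might be used to find residues that need antechamber. The set is copied from the value above; the (record type, residue name) input format and `nonstandard_resnames` are assumptions made for illustration.

```python
AMBER14_STANDARD = frozenset({
    'A', 'ACE', 'ALA', 'ARG', 'ASN', 'ASP', 'C', 'CA', 'CL', 'CYS', 'CYX',
    'DA', 'DC', 'DG', 'DT', 'FE', 'G', 'GLN', 'GLU', 'GLY', 'HID', 'HIE',
    'HIP', 'HIS', 'HOH', 'ILE', 'K', 'LEU', 'LYS', 'MET', 'MG', 'MN', 'NA',
    'NME', 'PHE', 'PRO', 'SER', 'T', 'THR', 'TRP', 'TYR', 'U', 'VAL', 'WAT',
    'ZN',
})


def nonstandard_resnames(records):
    """Return sorted HETATM residue names that AMBER14 does not cover."""
    return sorted({
        resname
        for record, resname in records
        if record == "HETATM" and resname not in AMBER14_STANDARD
    })


records = [
    ("ATOM", "ALA"),
    ("ATOM", "GLY"),
    ("HETATM", "HOH"),
    ("HETATM", "LIG"),
    ("HETATM", "ATP"),
    ("HETATM", "LIG"),
]
needs_gaff2 = nonstandard_resnames(records)
```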

class torchref.refinement.targets.amber_target.AmberTarget(model=None, cutoff=5.0, normalize_by_atoms=True, residue_charges=None, gaff2_files=None, verbose=0)[source]

Bases: ModelTarget

Differentiable AMBER14/GAFF2 force-field energy restraint.

On construction the target:

  1. Detects non-standard residues (HETATM not in AMBER14_STANDARD).

  2. Runs antechamber + parmchk2 (parallel, cached) for each non-standard residue.

  3. Builds an OpenMM system:

    • Standard path (no non-standard residues): filter model PDB to primary conformation + heavy atoms, use openmm.app.Modeller to re-add H with AMBER14-compatible names, create system with ForceField('amber14-all.xml').

    • GAFF2 path (with non-standard residues): the same protein PDB (with OXT additionally removed) is handed to tleap together with each ligand’s mol2 via combine{}. The combined AMBER14+GAFF2 topology is parameterised by parmed.

  4. Creates an OpenMM Context on the platform that matches the model’s device: CUDA for model.device.type == 'cuda', CPU otherwise. Falls back CUDA → OpenCL → CPU if the preferred platform is unavailable.

  5. Builds a model-atom → OpenMM-atom index map so that only heavy atoms are transferred; H positions are kept from the initial OpenMM setup.
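The platform selection in step 4 can be sketched as a simple ordered search. Availability is passed in as a callable so the example runs without OpenMM; `candidate_platforms` and `pick_platform` are hypothetical names, and the error raised when nothing is available is an assumption.

```python
def candidate_platforms(device_type):
    """Platforms to try, in order, for a given torch device type."""
    if device_type == "cuda":
        return ["CUDA", "OpenCL", "CPU"]
    return ["CPU"]


def pick_platform(device_type, is_available):
    """Return the first available platform name for the device type."""
    for name in candidate_platforms(device_type):
        if is_available(name):
            return name
    raise RuntimeError(f"No usable OpenMM platform for device '{device_type}'")


# Example: a machine with a GPU device but no CUDA platform installed.
chosen = pick_platform("cuda", lambda name: name in {"OpenCL", "CPU"})
```

With OpenMM installed, `is_available` could wrap openmm.Platform.getPlatformByName in a try/except, since that call raises when the named platform is missing.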

Parameters:
  • model (Model) –

    TorchRef model. Heavy-atom-only models (strip_H=True) are accepted. H atoms are added internally by OpenMM’s Modeller or tleap and are NOT included in the atom map or gradient.

    Passing a model that already has H atoms (via model.generate_hydrogens() or loading a PDB with H) speeds up initialisation ~4× because Modeller.addHydrogens() converges faster from existing positions.

    Required for GAFF2 ligands: antechamber’s BCC charge scheme runs a semiempirical QM step (sqm) that requires a fully protonated molecule. If the model has no H atoms for a non-standard residue, an explicit error is raised. Call model.generate_hydrogens() or load the PDB with strip_H=False before creating the target.

  • cutoff (float) – Non-bonded cutoff in Angstroms. Default 5.0.

  • normalize_by_atoms (bool) – If True the energy is divided by the number of model atoms. Default True.

  • residue_charges (dict[str, int], optional) – Net formal charge per non-standard residue name, e.g. {'LIG': -1, 'ATP': -4}. Residues not listed default to 0 with a warning.

  • verbose (int) – Verbosity level (0 = silent, 1 = informational, 2 = debug).
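The default-to-zero behaviour of residue_charges can be sketched as follows. `net_charge` is a hypothetical helper and the warning text is illustrative; the real message may differ.

```python
import warnings


def net_charge(resname, residue_charges=None):
    """Look up a residue's net formal charge, defaulting to 0 with a warning."""
    charges = residue_charges or {}
    if resname not in charges:
        warnings.warn(f"No net charge given for {resname}; assuming 0")
        return 0
    return charges[resname]


charges = {"LIG": -1, "ATP": -4}
lig_charge = net_charge("LIG", charges)  # listed, no warning
```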

name: str = 'amber'
__init__(model=None, cutoff=5.0, normalize_by_atoms=True, residue_charges=None, gaff2_files=None, verbose=0)[source]

Initialize model target.

Parameters:
  • model (Model, optional) – Reference to the Model object (optional for empty init).

  • verbose (int, optional) – Verbosity level. Default is 0.

forward()[source]

Compute AMBER14 energy for current model coordinates.

Returns:

Scalar energy in kJ/mol (or kJ/mol/atom if normalize_by_atoms). Gradient flows to model.xyz via OpenMM analytical forces.

Return type:

torch.Tensor

stats()[source]

Return target statistics for the logging pipeline.