torchref.refinement.optimizers package
Optimizers for crystallographic refinement.
This module provides custom optimizers and optimization functions:
- AdamWithAdaptiveNoise: Adam with scale-invariant noise injection
- optimize_simulated_annealing: Simulated annealing optimization
- optimize_stochastic_sa: Stochastic SA for internal coordinates (per-parameter)
- optimize_stochastic_sa_batch: Stochastic SA for internal coordinates (batch)
- optimize_internal_coord_sa: Universal SA for internal coordinates with auto-calibration
- optimize_gradient_sa: Gradient-based SA with per-parameter acceptance
- refine_sa_lbfgs: Combined Metropolis SA + LBFGS pipeline
- optimize_momentum_sa: Phenix-style SA (gradient descent + momentum + noise)
- refine_momentum_sa_lbfgs: Combined Phenix-style SA + LBFGS pipeline
- ExploratoryLBFGS: LBFGS with automatic landscape exploration via Lanczos
- LangevinSA: BAOAB Langevin dynamics with simulated annealing
- class torchref.refinement.optimizers.AdamWithAdaptiveNoise(params, lr=0.001, alpha=0.1, eps=1e-08, update_weight=0.05, **kwargs)[source]
Bases: Adam
Drop-in replacement for torch.optim.Adam with adaptive, scale-invariant noise injection.
Injects Gaussian noise into the gradients, scaled by the overfitting ratio (test NLL divided by training NLL), to counteract overfitting.
- Parameters:
params (iterable) – Model parameters to optimize.
lr (float, optional) – Learning rate. Default is 1e-3.
alpha (float, optional) – Scaling factor for how much noise to inject per unit overfitting ratio. Default is 0.1.
eps (float, optional) – Small constant for numerical stability. Default is 1e-8.
update_weight (float, optional) – Weight for exponential moving average of noise scale. Default is 0.05.
**kwargs – Additional arguments passed to Adam optimizer.
- __init__(params, lr=0.001, alpha=0.1, eps=1e-08, update_weight=0.05, **kwargs)[source]
Initialize AdamWithAdaptiveNoise.
- Parameters:
params (iterable) – Model parameters to optimize.
lr (float, optional) – Learning rate. Default is 1e-3.
alpha (float, optional) – Scaling factor for how much noise to inject per unit overfitting ratio. Default is 0.1.
eps (float, optional) – Small constant for numerical stability. Default is 1e-8.
update_weight (float, optional) – Weight for exponential moving average of noise scale. Default is 0.05.
**kwargs – Additional arguments passed to Adam optimizer.
- inject_noise()[source]
Inject scale-invariant Gaussian noise into gradients.
The noise standard deviation is proportional to the gradient and parameter norms, scaled by the current noise_scale and alpha.
- step()[source]
Perform a single optimization step with optional noise injection.
Injects noise into gradients before the Adam update if noise_scale > 0.
- update_noise_scale(train_nll, test_nll)[source]
Update the noise scale based on the ratio of test to training NLL.
If ratio > 1, the model is overfitting and noise is increased.
- Parameters:
train_nll (torch.Tensor) – Training set negative log-likelihood.
test_nll (torch.Tensor) – Test set negative log-likelihood.
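A minimal sketch of how such an update could work. It assumes that the target noise level is the excess of the test/train ratio over 1 and that update_weight blends this target into the running value as an exponential moving average. The function name ema_noise_scale and the exact formula are illustrative assumptions, not the library's implementation, and plain floats stand in for tensors.

```python
def ema_noise_scale(noise_scale, train_nll, test_nll, update_weight=0.05, eps=1e-8):
    """Hypothetical EMA update of a noise scale from the test/train NLL ratio."""
    # A ratio above 1 means the test NLL exceeds the training NLL (overfitting).
    # This sketch assumes positive NLL values.
    ratio = test_nll / (train_nll + eps)
    # Assumed target: only the excess above 1 contributes noise.
    target = max(ratio - 1.0, 0.0)
    # Exponential moving average with weight `update_weight`.
    return (1.0 - update_weight) * noise_scale + update_weight * target


scale = 0.0
for _ in range(200):
    scale = ema_noise_scale(scale, train_nll=1.0, test_nll=1.5)
# `scale` approaches the assumed target of 0.5 as the moving average converges.
```

With a small update_weight the scale reacts slowly, so a single noisy test evaluation does not switch the noise on or off abruptly.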
- class torchref.refinement.optimizers.ExploratoryLBFGS(params, lr=1.0, max_iter=20, history_size=100, m_modes=10, m_lanczos_iter=None, eigenvalue_threshold=0.01, participation_threshold=0.05, scan_points=20, scan_step_size=0.1, max_exploration_cycles=5, hvp_epsilon=0.0001, convergence_grad_threshold=1e-05, convergence_loss_threshold=1e-07, convergence_param_threshold=1e-06, n_stable=3, verbose=1)[source]
Bases: Optimizer
LBFGS optimizer with automatic landscape exploration via Lanczos analysis.
Composes with (rather than subclasses) torch.optim.LBFGS. After the internal LBFGS converges, performs eigenanalysis of the Hessian to find degenerate/flat directions, scans along them, and hops to better basins if found.
- Parameters:
params (iterable) – Parameters to optimize.
lr (float) – LBFGS learning rate. Default: 1.0.
max_iter (int) – LBFGS max line search iterations per step. Default: 20.
history_size (int) – LBFGS Hessian approximation memory. Default: 100.
m_modes (int) – Number of lowest eigenmodes to compute. Default: 10.
m_lanczos_iter (int, optional) – Lanczos iterations. Default: 2*m_modes + 10.
eigenvalue_threshold (float) – Mode is degenerate if eigenvalue < threshold * median(positive). Default: 0.01.
participation_threshold (float) – Parameter participates if |component| > threshold * ||mode||. Default: 0.05.
scan_points (int) – Evaluation points per scan direction. Default: 20.
scan_step_size (float) – Step size in parameter space units. Default: 0.1.
max_exploration_cycles (int) – Cap on explore-hop cycles. Default: 5.
hvp_epsilon (float) – Finite-difference epsilon for Hessian-vector products. Default: 1e-4.
convergence_grad_threshold (float) – Gradient norm convergence threshold. Default: 1e-5.
convergence_loss_threshold (float) – Loss change convergence threshold. Default: 1e-7.
convergence_param_threshold (float) – Parameter change convergence threshold. Default: 1e-6.
n_stable (int) – Consecutive converged steps required. Default: 3.
verbose (int) – Verbosity level: 0=silent, 1=summary, 2=detailed. Default: 1.
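The two selection rules given for eigenvalue_threshold and participation_threshold can be sketched in plain Python as follows. The helper names degenerate_modes and participating_parameters are hypothetical, lists stand in for tensors, and treating negative eigenvalues as degenerate is an assumption that follows from applying the stated inequality literally.

```python
import math
from statistics import median


def degenerate_modes(eigenvalues, eigenvalue_threshold=0.01):
    """Indices of modes with eigenvalue < threshold * median(positive eigenvalues)."""
    positive = [ev for ev in eigenvalues if ev > 0.0]
    if not positive:
        return []
    cutoff = eigenvalue_threshold * median(positive)
    return [i for i, ev in enumerate(eigenvalues) if ev < cutoff]


def participating_parameters(mode, participation_threshold=0.05):
    """Indices of components with |component| > threshold * ||mode||."""
    norm = math.sqrt(sum(c * c for c in mode))
    return [i for i, c in enumerate(mode) if abs(c) > participation_threshold * norm]


eigenvalues = [1e-4, 0.5, 1.0, 2.0, 4.0]   # median of the positive values is 1.0
flat = degenerate_modes(eigenvalues)        # cutoff is 0.01, so only index 0 qualifies
active = participating_parameters([0.9, 0.01, 0.4, 0.0])  # indices 0 and 2
```

Using the median of the positive eigenvalues as the reference makes the cutoff relative to the typical curvature, so the same threshold works for problems with very different overall scales.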
- __init__(params, lr=1.0, max_iter=20, history_size=100, m_modes=10, m_lanczos_iter=None, eigenvalue_threshold=0.01, participation_threshold=0.05, scan_points=20, scan_step_size=0.1, max_exploration_cycles=5, hvp_epsilon=0.0001, convergence_grad_threshold=1e-05, convergence_loss_threshold=1e-07, convergence_param_threshold=1e-06, n_stable=3, verbose=1)[source]
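The hvp_epsilon parameter indicates that Hessian-vector products are approximated by finite differences of gradients. A minimal sketch of that idea is shown here, assuming a central difference; whether the library uses a central or a forward difference is not stated. The function hvp_finite_difference and the test function are illustrative only.

```python
def hvp_finite_difference(grad_fn, x, v, epsilon=1e-4):
    """Approximate H(x) @ v as (grad(x + eps*v) - grad(x - eps*v)) / (2*eps)."""
    x_plus = [xi + epsilon * vi for xi, vi in zip(x, v)]
    x_minus = [xi - epsilon * vi for xi, vi in zip(x, v)]
    g_plus = grad_fn(x_plus)
    g_minus = grad_fn(x_minus)
    return [(gp - gm) / (2.0 * epsilon) for gp, gm in zip(g_plus, g_minus)]


def grad_f(p):
    """Gradient of f(x, y) = x^2 + 3xy + 2y^2, whose Hessian is [[2, 3], [3, 4]]."""
    x, y = p
    return [2.0 * x + 3.0 * y, 3.0 * x + 4.0 * y]


# The product with the first unit vector recovers the first Hessian column, about [2, 3].
hv = hvp_finite_difference(grad_f, [1.0, -2.0], [1.0, 0.0])
```

Each product costs two gradient evaluations and never forms the Hessian, which is what makes a Lanczos iteration over a large parameter vector affordable.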
- property phase: OptimizerPhase
Current optimizer phase.
- class torchref.refinement.optimizers.LangevinSA(params, dt=0.01, friction=10.0, T_initial=2500.0, T_final=0.01, total_steps=1000, cooling_schedule='exponential', adaptive_masses=True, mass_beta=0.999, mass_eps=1e-08, gradient_clip=None, max_step_size=0.1)[source]
Bases: Optimizer
BAOAB Langevin dynamics integrator with simulated annealing.
Implements the BAOAB splitting scheme (Leimkuhler & Matthews, 2013) for gradient-guided exploration with thermodynamically correct noise. One gradient evaluation per step via staggered B steps.
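For orientation, the standard BAOAB update for a single degree of freedom is sketched here in plain Python, in units where k_B = 1. The function baoab_step and its arguments are hypothetical and this is not the library's code. The force from the previous step is reused for the first half kick, so each step needs only one gradient evaluation.

```python
import math
import random


def baoab_step(x, v, force, grad_fn, dt, friction, temperature, mass=1.0, rng=random):
    """One BAOAB step for a single degree of freedom (k_B = 1)."""
    v = v + 0.5 * dt * force / mass            # B: half kick with the stored force
    x = x + 0.5 * dt * v                       # A: half drift
    c1 = math.exp(-friction * dt)              # O: exact Ornstein-Uhlenbeck update
    c2 = math.sqrt((1.0 - c1 * c1) * temperature / mass)
    v = c1 * v + c2 * rng.gauss(0.0, 1.0)
    x = x + 0.5 * dt * v                       # A: half drift
    force = -grad_fn(x)                        # the only gradient evaluation in the step
    v = v + 0.5 * dt * force / mass            # B: half kick with the new force
    return x, v, force


def grad_u(x):
    """Gradient of the harmonic well U(x) = x^2 / 2."""
    return x


x, v = 1.0, 0.0
force = -grad_u(x)
for _ in range(5000):
    x, v, force = baoab_step(x, v, force, grad_u, dt=0.01, friction=2.0, temperature=0.0)
# At zero temperature the O step is pure damping, so x relaxes toward the minimum at 0.
```

At a finite temperature the same O step adds Gaussian noise whose variance matches the friction, which is what keeps the sampled distribution thermodynamically consistent.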
Adaptive masses from EMA of squared gradients provide automatic scale invariance across all parameter types (xyz, B-factors, occupancies, torsions, etc.).
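The mass update can be sketched as an exponential moving average of squared gradients. The helper update_masses is hypothetical, and applying mass_eps as a lower bound with max is an assumption about how the floor is enforced.

```python
def update_masses(masses, grads, mass_beta=0.999, mass_eps=1e-8):
    """EMA of squared gradients used as per-element masses, floored at mass_eps."""
    return [
        max(mass_beta * m + (1.0 - mass_beta) * g * g, mass_eps)
        for m, g in zip(masses, grads)
    ]


# One stiff element (large gradient) and one soft element (small gradient).
masses = [1e-8, 1e-8]
for _ in range(200):
    # A smaller beta than the default is used so this short loop converges.
    masses = update_masses(masses, grads=[100.0, 0.01], mass_beta=0.9)
# The masses approach grad^2: about 1e4 for the stiff element and 1e-4 for the soft one.
```

Because the thermal velocity scales as sqrt(T / m), an element with large gradients receives a large mass and moves in small steps, while a soft element moves in proportionally larger ones.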
Call calibrate() before the main loop to probe parameter stiffness and warm up the adaptive masses without moving the structure.
- Args:
params: Iterable of parameters or param groups.
dt: Integration timestep.
friction: Friction coefficient gamma. Controls thermalization speed.
T_initial: Starting temperature.
T_final: Final temperature.
total_steps: Total number of annealing steps.
cooling_schedule: 'exponential' or 'linear'.
adaptive_masses: Use EMA of grad² as per-element masses.
mass_beta: EMA decay for adaptive masses.
mass_eps: Floor for adaptive masses (numerical stability).
gradient_clip: Optional max gradient norm (per-parameter).
max_step_size: Maximum displacement per element per full step. Velocities are clamped so |v * dt| <= max_step_size.
- __init__(params, dt=0.01, friction=10.0, T_initial=2500.0, T_final=0.01, total_steps=1000, cooling_schedule='exponential', adaptive_masses=True, mass_beta=0.999, mass_eps=1e-08, gradient_clip=None, max_step_size=0.1)[source]
- property temperature
Current temperature from the annealing schedule.
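The two cooling schedules named in the constructor can be sketched as follows. The function annealing_temperature is hypothetical, and the interpolation formulas are the conventional ones, assumed here rather than taken from the library source.

```python
def annealing_temperature(step, total_steps, T_initial=2500.0, T_final=0.01,
                          cooling_schedule="exponential"):
    """Temperature at `step`, interpolated between T_initial and T_final."""
    # Fraction of the schedule completed, clamped to [0, 1].
    frac = min(max(step / total_steps, 0.0), 1.0)
    if cooling_schedule == "exponential":
        # Geometric interpolation: a constant cooling ratio per step.
        return T_initial * (T_final / T_initial) ** frac
    if cooling_schedule == "linear":
        # Arithmetic interpolation: a constant temperature drop per step.
        return T_initial + (T_final - T_initial) * frac
    raise ValueError(f"unknown cooling_schedule: {cooling_schedule!r}")


t_mid_exp = annealing_temperature(500, 1000)                             # geometric mean
t_mid_lin = annealing_temperature(500, 1000, cooling_schedule="linear")  # arithmetic mean
```

With the default temperatures the exponential schedule spends most of its steps at low temperature, whereas the linear schedule stays hot for much longer.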
- property current_step
- property total_steps
- property kinetic_energy
Sum of 0.5 * m * v^2 over all parameters (diagnostic).
- calibrate(closure, n_steps=10)[source]
Probe parameter stiffness over n_steps, then rollback.
Runs small random perturbations to collect gradient statistics, sets the adaptive masses from the observed grad², then restores all parameters to their original values and initialises velocities from Maxwell-Boltzmann with correctly scaled masses.
- Args:
closure: Same closure as for step(); it must call zero_grad, compute the loss, call backward, and return the loss.
n_steps: Number of probing steps.
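The Maxwell-Boltzmann velocity initialisation that calibrate() performs can be illustrated as follows, in units where k_B = 1. The helpers maxwell_boltzmann_velocities and kinetic_energy are hypothetical stand-ins that work on plain lists; the second mirrors the 0.5 * m * v^2 sum documented for the kinetic_energy property.

```python
import math
import random


def maxwell_boltzmann_velocities(masses, temperature, rng):
    """Draw v_i from a normal distribution with variance T / m_i (k_B = 1)."""
    return [rng.gauss(0.0, math.sqrt(temperature / m)) for m in masses]


def kinetic_energy(masses, velocities):
    """Sum of 0.5 * m * v^2 over all elements."""
    return sum(0.5 * m * v * v for m, v in zip(masses, velocities))


rng = random.Random(0)
masses = [1e4, 1.0, 1e-4] * 10000   # stiff, medium, and soft elements
velocities = maxwell_boltzmann_velocities(masses, temperature=2.0, rng=rng)
ke_per_element = kinetic_energy(masses, velocities) / len(masses)
# Equipartition: the mean kinetic energy per element is close to T / 2,
# independent of the individual masses.
```

This is why the masses need to be calibrated first: velocities drawn with badly scaled masses would start the run far from the intended temperature.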
- step(closure)[source]
Perform one BAOAB Langevin dynamics step.
Tracks the best-loss configuration and rolls back to it when the loss exceeds loss_rollback_factor times the best loss seen so far. This prevents the dynamics from permanently damaging the structure while still allowing uphill exploration.
- Args:
closure: A callable that re-evaluates the model and returns the loss. The closure must call loss.backward() before returning.
- Returns:
The loss value from the closure evaluation.
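The rollback rule described for step() can be sketched as a small helper. The function track_and_rollback is hypothetical, the value 2.0 for loss_rollback_factor is a placeholder rather than the library default, and the sketch assumes positive losses.

```python
def track_and_rollback(params, loss, best, loss_rollback_factor=2.0):
    """Return (params, best, rolled_back) after applying the rollback rule.

    `best` is None or a (best_loss, best_params) tuple.
    """
    if best is None or loss < best[0]:
        # New best configuration: remember a copy of it.
        return params, (loss, list(params)), False
    if loss > loss_rollback_factor * best[0]:
        # Too far uphill: restore a copy of the best configuration.
        return list(best[1]), best, True
    # Moderately uphill: allowed, keep exploring.
    return params, best, False


params, best, rolled_back = track_and_rollback([0.0], 10.0, None)   # first call sets the best
params, best, rolled_back = track_and_rollback([0.3], 15.0, best)   # uphill, within the factor
params, best, rolled_back = track_and_rollback([0.9], 25.0, best)   # above 2 * 10: rolled back
```

A larger factor permits longer uphill excursions before the state is reset, at the cost of spending more steps in poor regions of the landscape.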
- class torchref.refinement.optimizers.MomentumStochasticSA(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, T_initial=1.0, T_final=0.01, total_steps=1000)[source]
Bases: Adam
Adam-based SA where noise is scaled by the adaptive learning rate, giving automatic scale invariance across parameters.
Submodules
- torchref.refinement.optimizers.adam_noise module
- torchref.refinement.optimizers.exploratory_lbfgs module
  - OptimizerPhase
  - Mode
  - ScanPoint
  - Basin
  - ParameterGroup
  - ConvergenceTracker
    - ConvergenceTracker.grad_threshold
    - ConvergenceTracker.loss_threshold
    - ConvergenceTracker.param_threshold
    - ConvergenceTracker.n_stable
    - ConvergenceTracker.grad_norms
    - ConvergenceTracker.loss_changes
    - ConvergenceTracker.param_changes
    - ConvergenceTracker.prev_loss
    - ConvergenceTracker.prev_params
    - ConvergenceTracker.update()
    - ConvergenceTracker.is_converged
    - ConvergenceTracker.reset()
    - ConvergenceTracker.__init__()
  - ExplorationResult
  - LanczosError
  - ExploratoryLBFGS
- torchref.refinement.optimizers.internal_coord_sa module
- torchref.refinement.optimizers.langevin_sa module
- torchref.refinement.optimizers.momentum_sa module
- torchref.refinement.optimizers.simulated_annealing module