torchref.refinement.optimizers.adam_noise module

Adam optimizer with adaptive noise injection for regularization.

This optimizer extends Adam with scale-invariant Gaussian noise injection to prevent overfitting during crystallographic refinement.

class torchref.refinement.optimizers.adam_noise.AdamWithAdaptiveNoise(params, lr=0.001, alpha=0.1, eps=1e-08, update_weight=0.05, **kwargs)

Bases: Adam

Drop-in replacement for torch.optim.Adam with adaptive, scale-invariant noise injection.

Injects Gaussian noise into the gradients, with a magnitude scaled by the overfitting ratio (test NLL divided by training NLL), to counteract overfitting.

Parameters:
  • params (iterable) – Model parameters to optimize.

  • lr (float, optional) – Learning rate. Default is 1e-3.

  • alpha (float, optional) – Scaling factor for how much noise to inject per unit overfitting ratio. Default is 0.1.

  • eps (float, optional) – Small constant for numerical stability. Default is 1e-8.

  • update_weight (float, optional) – Weight for the exponential moving average (EMA) of the noise scale. Default is 0.05.

  • **kwargs – Additional arguments passed to Adam optimizer.

alpha

Noise scaling factor.

Type: float

eps

Numerical stability constant.

Type: float

noise_scale

Current noise scale (dynamically updated).

Type: float

update_weight

EMA weight for noise scale updates.

Type: float

__init__(params, lr=0.001, alpha=0.1, eps=1e-08, update_weight=0.05, **kwargs)

Initialize AdamWithAdaptiveNoise.

inject_noise()

Inject scale-invariant Gaussian noise into gradients.

The noise standard deviation is proportional to the gradient and parameter norms, scaled by the current noise_scale and alpha.
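The scaling described above can be illustrated in plain Python. This is a minimal sketch, not torchref's implementation: the helper add_scale_invariant_noise is hypothetical, it operates on a list of floats rather than tensors, and it assumes the noise standard deviation equals alpha * noise_scale times the RMS of the gradient. The parameter-norm factor mentioned above is left out because its exact form is not stated here.

```python
import math
import random


def add_scale_invariant_noise(grad, noise_scale, alpha=0.1, rng=None):
    """Return a noisy copy of `grad` (hypothetical helper, assumption only).

    The noise standard deviation is tied to the RMS of the gradient, so
    rescaling the gradient rescales the noise by the same factor.
    """
    rng = rng or random.Random()
    if noise_scale <= 0:
        # No noise requested: return the gradient unchanged.
        return list(grad)
    rms = math.sqrt(sum(g * g for g in grad) / len(grad))
    sigma = alpha * noise_scale * rms  # assumed form of the noise std
    return [g + rng.gauss(0.0, sigma) for g in grad]
```

Because sigma scales with the gradient itself, multiplying every gradient entry by a constant multiplies the injected noise by the same constant, which is one way to read "scale-invariant" here.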

step()

Perform a single optimization step with optional noise injection.

Injects noise into gradients before the Adam update if noise_scale > 0.
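One plausible way to combine step() with update_noise_scale() in a refinement loop is sketched below. The helper refinement_step and its callable arguments train_nll_fn and test_nll_fn are hypothetical names introduced for illustration. The ordering (refresh the noise scale, then step) is an assumption, not something this page prescribes.

```python
def refinement_step(optimizer, train_nll_fn, test_nll_fn):
    """Run one refinement step (hypothetical helper, assumption only).

    `optimizer` is expected to behave like AdamWithAdaptiveNoise;
    `train_nll_fn` and `test_nll_fn` return the training and test NLL.
    """
    optimizer.zero_grad()
    train_nll = train_nll_fn()
    train_nll.backward()  # gradients come from the training NLL only

    # In real PyTorch code, evaluate the test NLL under torch.no_grad().
    test_nll = test_nll_fn()

    # Assumed ordering: refresh the noise scale first so that step()
    # injects noise according to the latest overfitting ratio.
    optimizer.update_noise_scale(train_nll.detach(), test_nll)
    optimizer.step()
    return optimizer.noise_scale
```

Returning noise_scale makes it easy to log how strongly the optimizer is regularizing at each step.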

update_noise_scale(train_nll, test_nll)

Update the noise scale based on the ratio of test to training NLL.

A ratio above 1 (test NLL higher than training NLL) indicates overfitting, so the noise scale is increased.

Parameters:
  • train_nll (torch.Tensor) – Training set negative log-likelihood.

  • test_nll (torch.Tensor) – Test set negative log-likelihood.
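The update can be pictured as an exponential moving average driven by the test-to-training ratio. The sketch below is an assumption about the general shape of such an update, not torchref's actual formula. The function ema_noise_scale and the target max(ratio - 1, 0) are hypothetical, and plain floats with positive NLL values stand in for tensors.

```python
def ema_noise_scale(noise_scale, train_nll, test_nll,
                    update_weight=0.05, eps=1e-8):
    """Return an updated noise scale (hypothetical helper, assumption only)."""
    # Overfitting ratio: test NLL relative to training NLL.
    ratio = test_nll / (train_nll + eps)
    # Assumed target: zero when not overfitting, growing with the excess.
    target = max(ratio - 1.0, 0.0)
    # EMA: keep most of the old value, blend in a little of the target.
    return (1.0 - update_weight) * noise_scale + update_weight * target
```

With a small update_weight such as 0.05, a single noisy NLL evaluation shifts the noise scale only slightly, while sustained overfitting raises it steadily.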