torchref.cli.collection_difference_refine module
Collection-based difference refinement using joint scaling with proper bulk solvent correction.
Uses the collection infrastructure (ModelCollection, DatasetCollection, CollectionScaler) for a clean, joint-scaling approach to difference refinement. A single set of scale parameters (overall scale, anisotropy, bulk solvent k_sol/B_sol) is shared across both dark and light datasets.
Output
PDB/CIF files for refined dark and light models, a JSON summary, and a difference MTZ file with the following columns:
- Observed data:
Fo_dark, SIGFo_dark    Observed amplitudes and sigma (dark)
Fo_light, SIGFo_light  Observed amplitudes and sigma (light)
- Differences:
DF     F_light - F_dark (scalar difference)
SIGDF  Propagated sigma on DF
WDF    Sigma-weighted DF
- Calculated:
Fc_dark, Fc_light  |Fc| for dark and mixed models
DFc                |Fc_light| - |Fc_dark| (scalar)
DFc_complex        |Fc_light*exp(i*phi) - Fc_dark*exp(i*phi)|
- DED map coefficients (phase-aware):
2mDFop-DFc  Weighted 2*DFo_phased - DFc (DED map)
mDFop-DFc   Weighted DFo_phased - DFc (DED difference map)
- Phases:
PHIC_dark   Calculated phase (dark model)
PHIC_mixed  Calculated phase (mixed model)
PHIC_diff   Phase of complex Fc difference
PHIC_light  Calculated phase (pure light model)
- Phase-aware extrapolation:
Grafts calculated phases onto observed amplitudes, then extrapolates the complex structure factors:
F_extp = (Fo_light * exp(i*phi_mixed) - w_d * Fo_dark * exp(i*phi_dark)) / w_l
sigma_extp = sqrt(sig_light^2 + w_d^2 * sig_dark^2) / w_l

Fextp      |F_extp|
2Fextp-Fc  2*Fextp - Fc (map coefficient)
Fextp-Fc   Fextp - Fc (difference map coefficient)
- Classic (amplitude-only) extrapolation:
Scalar extrapolation using amplitudes only (no phase information):
F_extc = (|Fo_light| - w_d * |Fo_dark|) / w_l
sigma_extc = sqrt(sig_light^2 + w_d^2 * sig_dark^2) / w_l

Fextc, SIGFextc      Extrapolated amplitude and sigma
2Fextc-Fc, Fextc-Fc  Map coefficients
- Empirical Bayes extrapolation:
Starts from the phase-aware extrapolation, then applies per-reflection amplitude shrinkage towards Fo_dark to regularise noisy high-resolution and weakly-measured reflections:
F_ext = |Fo_dark*exp(i*phi_dark) + dF/f|    (phase-aware)
sig_ext^2 = (sig_light^2 + sig_dark^2) / f^2
tau^2 = max(<(F_ext - Fo_dark)^2> - <sig_ext^2>, floor)
w(h) = tau^2 / (tau^2 + sig_ext^2(h))
F_extb(h) = w(h) * F_ext(h) + (1-w(h)) * Fo_dark(h)
tau^2 is the estimated global signal variance; w(h) is the per-reflection shrinkage weight (high for strong/well-measured reflections, low for noisy ones).
Fextb, SIGFextb      Shrinkage-regularised amplitude and sigma
2Fextb-Fc, Fextb-Fc  Map coefficients
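As a rough illustration of the difference and extrapolation columns listed above, the following standard-library Python sketch evaluates them for a single reflection. The function names are hypothetical and are not part of torchref. The sketch assumes independent measurement errors for SIGDF, and assumes that the weights are w_l = fraction and w_d = 1 - fraction, which is the choice that makes the phase-aware formula agree with the dF/f form used for the empirical Bayes columns.

```python
import cmath
import math


def difference_terms(fo_dark, fo_light, sig_dark, sig_light):
    """Scalar difference DF and its sigma (assumes independent errors)."""
    df = fo_light - fo_dark
    sig_df = math.sqrt(sig_light**2 + sig_dark**2)
    return df, sig_df


def extrapolate_phase_aware(fo_dark, fo_light, sig_dark, sig_light,
                            phi_dark, phi_mixed, w_d, w_l):
    """Graft calculated phases onto observed amplitudes, then extrapolate."""
    f_light = fo_light * cmath.exp(1j * phi_mixed)
    f_dark = fo_dark * cmath.exp(1j * phi_dark)
    f_extp = (f_light - w_d * f_dark) / w_l
    sigma = math.sqrt(sig_light**2 + w_d**2 * sig_dark**2) / w_l
    return abs(f_extp), sigma


def extrapolate_classic(fo_dark, fo_light, sig_dark, sig_light, w_d, w_l):
    """Amplitude-only extrapolation; no phase information is used."""
    f_extc = (fo_light - w_d * fo_dark) / w_l
    sigma = math.sqrt(sig_light**2 + w_d**2 * sig_dark**2) / w_l
    return f_extc, sigma


# Hypothetical reflection. ASSUMPTION: w_l = fraction, w_d = 1 - fraction.
fraction = 0.37
w_l, w_d = fraction, 1.0 - fraction

df, sig_df = difference_terms(100.0, 104.0, 2.0, 2.0)
fextp, sig_extp = extrapolate_phase_aware(
    100.0, 104.0, 2.0, 2.0, phi_dark=0.5, phi_mixed=0.5, w_d=w_d, w_l=w_l
)
fextc, sig_extc = extrapolate_classic(100.0, 104.0, 2.0, 2.0, w_d, w_l)
```

When phi_dark and phi_mixed coincide, as in this example, the phase-aware and classic amplitudes agree; they diverge as the two phases move apart, while the propagated sigma is the same in both cases.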
Examples
torchref.collection-difference-refine \
-dm dark.pdb -lm light.pdb \
-dsf dark.mtz -lsf light.mtz \
--fraction 0.37 -o output/
- torchref.cli.collection_difference_refine.setup_model_collection(pdb_dark, pdb_light, fractions, cif, d_min, device, verbose, hydrogenate=False)
Load models and create a ModelCollection.
- Parameters:
hydrogenate (bool) – If True, add explicit hydrogens to both models. H atoms participate in geometry/VDW restraints (preventing clashes) but are excluded from structure factor calculations.
- torchref.cli.collection_difference_refine.setup_dataset_collection(sf_dark, sf_light, d_min, device, column_names_dark=None, column_names_light=None)
Load reflection data and create a DatasetCollection.
- torchref.cli.collection_difference_refine.setup_scaler(dataset_collection, model_collection, device, verbose=1)
Create a CollectionScaler with per-component solvent models.
- torchref.cli.collection_difference_refine.compute_rfactors(model, data, scaler)
Compute R-work/R-free using forward_mixed for proper solvent.
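For orientation, the conventional crystallographic R-factor that R-work and R-free refer to can be sketched as below. This is not the torchref implementation: the helper names are hypothetical, and the sketch assumes the calculated amplitudes have already been scaled and solvent-corrected (the part that forward_mixed is described as handling), so only the residual and the work/free split are shown.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|), with Fc already on the Fo scale."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_work_free(f_obs, f_calc, free_flags):
    """Split reflections by the free flag and compute R for each subset."""
    work = [(fo, fc) for fo, fc, is_free in zip(f_obs, f_calc, free_flags) if not is_free]
    free = [(fo, fc) for fo, fc, is_free in zip(f_obs, f_calc, free_flags) if is_free]
    r_work = r_factor(*zip(*work))
    r_free = r_factor(*zip(*free))
    return r_work, r_free


# Hypothetical amplitudes; the last reflection belongs to the free set.
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 55.0, 80.0, 25.0]
free_flags = [False, False, False, True]
r_work, r_free = r_work_free(f_obs, f_calc, free_flags)
```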
- torchref.cli.collection_difference_refine.setup_loss_state(dataset_collection, model_collection, scaler, target_weights, device, similarity_alpha=2.0)
Build LossState with collection-aware targets.
Geometry and ADP restraints are applied only to the light base model (the dark model is a frozen reference).
- torchref.cli.collection_difference_refine.compute_bayes_extrapolated_amplitudes(Fobs_dark, Fobs_light, sig_dark, sig_light, phi_dark, phi_mixed, f, *, tau_sq_floor=0.0001)
Empirical Bayes shrinkage estimator for extrapolated SF amplitudes.
Estimates per-reflection shrinkage weights from the propagated variance of the extrapolation, then shrinks the phase-aware extrapolated amplitude towards Fo_dark:
F_ext = |F_dark*e^(iφ_d) + ΔF/f|    (phase-aware amplitude)
σ_ext² = (σ_light² + σ_dark²) / f²
τ² = max(<(F_ext - Fo_dark)²> - <σ_ext²>, floor)
w(h) = τ² / (τ² + σ_ext²(h))
F_extb = w(h)·F_ext + (1-w(h))·Fo_dark    (amplitude shrinkage)
- Parameters:
Fobs_dark (Tensor (N,)) – Observed amplitudes for the dark dataset.
Fobs_light (Tensor (N,)) – Observed amplitudes for the light dataset.
sig_dark (Tensor (N,)) – Measurement uncertainties on Fobs_dark.
sig_light (Tensor (N,)) – Measurement uncertainties on Fobs_light.
phi_dark (Tensor (N,)) – Calculated phases (radians) for the dark model.
phi_mixed (Tensor (N,)) – Calculated phases (radians) for the mixed model.
f (float or Tensor (scalar)) – Excited-state population fraction.
tau_sq_floor (float) – Minimum signal variance; lower bound applied to the estimated tau^2.
- Returns:
F_ext_bayes (Tensor (N,)) – Phase-aware extrapolated amplitudes (before shrinkage).
var_ext_bayes (Tensor (N,)) – Posterior variance per reflection.
w_shrinkage (Tensor (N,)) – Per-reflection shrinkage weights.
tau_sq (float) – Estimated global signal variance.
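The shrinkage formulas above can be traced with a small standard-library sketch over plain Python lists. It is not the torchref function: the name bayes_shrink is hypothetical, ΔF is assumed to be the phased difference Fo_light·e^(iφ_mixed) - Fo_dark·e^(iφ_dark), the angle brackets are assumed to be plain means over all reflections, and the posterior variance is omitted because its formula is not given here.

```python
import cmath


def bayes_shrink(fo_dark, fo_light, sig_dark, sig_light,
                 phi_dark, phi_mixed, f, tau_sq_floor=1e-4):
    """Shrink phase-aware extrapolated amplitudes towards Fo_dark."""
    f_ext, var_ext = [], []
    for fd, fl, sd, sl, pd, pm in zip(fo_dark, fo_light, sig_dark,
                                      sig_light, phi_dark, phi_mixed):
        dark = fd * cmath.exp(1j * pd)
        delta = fl * cmath.exp(1j * pm) - dark  # ASSUMED definition of dF
        f_ext.append(abs(dark + delta / f))
        var_ext.append((sl**2 + sd**2) / f**2)

    n = len(f_ext)
    # ASSUMPTION: <...> is an unweighted mean over all reflections.
    mean_sq_dev = sum((fe - fd) ** 2 for fe, fd in zip(f_ext, fo_dark)) / n
    mean_var = sum(var_ext) / n
    tau_sq = max(mean_sq_dev - mean_var, tau_sq_floor)

    weights = [tau_sq / (tau_sq + v) for v in var_ext]
    f_extb = [w * fe + (1.0 - w) * fd
              for w, fe, fd in zip(weights, f_ext, fo_dark)]
    return f_ext, f_extb, weights, tau_sq


# Hypothetical data: the third reflection is weak and noisy.
fo_dark = [100.0, 50.0, 10.0]
fo_light = [104.0, 49.0, 10.5]
sig_dark = [0.5, 0.5, 2.0]
sig_light = [0.5, 0.5, 2.0]
phases = [0.0, 0.0, 0.0]
f_ext, f_extb, weights, tau_sq = bayes_shrink(
    fo_dark, fo_light, sig_dark, sig_light, phases, phases, f=0.37
)
```

In this example the noisy third reflection receives a smaller weight than the first two, so its extrapolated amplitude is pulled more strongly back towards Fo_dark.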
- torchref.cli.collection_difference_refine.write_results_mtz(dc, mc, scaler, filename)
Write difference / extrapolated map coefficients to an MTZ file.
- Parameters:
dc (DatasetCollection)
mc (ModelCollection)
scaler (CollectionScaler)
filename (str) – Output MTZ path.
- torchref.cli.collection_difference_refine.optimize_lbfgs(state, parameters, max_iter, nsteps, n_clean, verbose)
Run a block of LBFGS optimisation steps via LossState.step().
state.step handles the closure and NaN validation, and automatically disables requires_grad on any loss-relevant leaves outside parameters. In particular, this covers the dark model's leaves, which appear in the difference target's autograd graph but are intentionally not optimised. The dark model effectively becomes a frozen reference at the autograd level for the duration of each step.
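The freezing behaviour described above can be imitated in plain PyTorch. The sketch below does not use LossState; the tensors light and dark, the toy loss, and the helper lbfgs_block are hypothetical stand-ins, and restoring requires_grad afterwards is an assumption based on the phrase "for the duration of each step".

```python
import torch

# Hypothetical stand-ins: "light" is refined, "dark" is a frozen reference.
light = torch.nn.Parameter(torch.tensor([1.0, -2.0]))
dark = torch.nn.Parameter(torch.tensor([0.5, 0.5]))
target = torch.tensor([0.8, 0.1])


def loss_fn():
    # Toy difference target: both models appear in the autograd graph.
    return (((light - dark) - target) ** 2).sum()


def lbfgs_block(parameters, loss_leaves, nsteps, max_iter):
    """Run nsteps LBFGS steps, freezing leaves that are not optimised."""
    optimised = {id(p) for p in parameters}
    frozen = [p for p in loss_leaves
              if id(p) not in optimised and p.requires_grad]
    for p in frozen:
        p.requires_grad_(False)
    optimizer = torch.optim.LBFGS(parameters, max_iter=max_iter)

    def closure():
        optimizer.zero_grad()
        loss = loss_fn()
        if not torch.isfinite(loss):  # simple NaN/Inf validation
            raise FloatingPointError("non-finite loss")
        loss.backward()
        return loss

    try:
        for _ in range(nsteps):
            optimizer.step(closure)
    finally:
        # ASSUMPTION: the frozen leaves are re-enabled after the block.
        for p in frozen:
            p.requires_grad_(True)
    return loss_fn().item()


dark_before = dark.detach().clone()
final_loss = lbfgs_block([light], [light, dark], nsteps=3, max_iter=10)
```

Only light is passed to the optimiser, so dark keeps its values and receives no gradient while the block runs, even though it contributes to the loss.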