torchref.base.targets.triton.adp_simu module

Triton forward + Triton backward for the ADP-similarity (SIMU) target.

The math is trivial (gather two B-factors, subtract, Gaussian NLL), but this target showed the widest math/target gap (~0.23 forward) in benchmarking, so it is a clean win to tritonize.
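For orientation, the gather / subtract / Gaussian NLL sequence can be sketched in plain PyTorch. This is a minimal sketch under stated assumptions, not the library's implementation: the function name adp_simu_nll_reference is hypothetical, pair_indices is assumed to have shape (n_pairs, 2), the per-pair terms are assumed to be summed, and the constant 0.5 * log(2 * pi) is assumed to be included. The actual reduction and constant terms in torchref may differ.

```python
import math

import torch


def adp_simu_nll_reference(b, pair_indices, simu_sigma):
    """Hypothetical pure-PyTorch reference for a SIMU-style Gaussian NLL.

    Assumptions (not confirmed by the torchref source):
      * pair_indices has shape (n_pairs, 2) and an integer dtype
      * the per-pair NLL terms are summed
      * the 0.5 * log(2 * pi) constant is kept
    """
    sigma = torch.as_tensor(simu_sigma, dtype=b.dtype, device=b.device)
    # Gather the two B-factors of every restrained pair.
    b_i = b[pair_indices[:, 0]]
    b_j = b[pair_indices[:, 1]]
    # Subtract, then score the difference under a zero-mean Gaussian.
    diff = b_i - b_j
    nll = 0.5 * (diff / sigma) ** 2 + torch.log(sigma) + 0.5 * math.log(2.0 * math.pi)
    return nll.sum()


b = torch.tensor([10.0, 12.0, 15.0], requires_grad=True)
pair_indices = torch.tensor([[0, 1], [1, 2]])
loss = adp_simu_nll_reference(b, pair_indices, 2.0)
loss.backward()
```

Under these assumptions the gradient with respect to each B-factor is the sum of diff / sigma**2 over the pairs in which it appears first, minus the same quantity over the pairs in which it appears second.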

All scalar parameters (sigma, log_sigma, grad_out) are passed as 0-D device tensors and read in-kernel with tl.load, so no .item() host syncs are needed.
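The host-side half of that pattern can be illustrated as follows. This is a sketch only: the helper name as_device_scalar is hypothetical and is not part of torchref, and the kernel launch itself is omitted. The point is that derived scalars such as log_sigma are computed with tensor ops, so they stay on the device rather than round-tripping through Python floats.

```python
import torch


def as_device_scalar(value, device, dtype=torch.float32):
    """Hypothetical helper: wrap a Python or tensor scalar as a 0-D device tensor."""
    return torch.as_tensor(value, dtype=dtype, device=device).reshape(())


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

sigma = as_device_scalar(2.0, device)
# Derived on the device; no .item() call, so no host synchronization here.
log_sigma = torch.log(sigma)
# A kernel would receive sigma and log_sigma as pointers and read them in-kernel.
```

By contrast, calling sigma.item() on a CUDA tensor copies the value to the host and blocks until the device has produced it.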

torchref.base.targets.triton.adp_simu.adp_simu_math_triton(b, pair_indices, simu_sigma)

Triton-backed ADP similarity (SIMU) Gaussian NLL.

Drop-in replacement for torchref.base.targets.adp.adp_simu_math().