torchref.base.kernels.separable_triton_kernel module

Separable Gaussian density splatting via Triton.

Factorizes exp(-alpha * f^T G f), where f is the fractional offset of a voxel from the atom and G is the metric tensor of the cell, into 1D Gaussian tables along each fractional axis, with optional 2D cross-term corrections for non-orthogonal cells. One Triton program runs per atom. All 1D/2D tables live in a small per-atom scratch buffer (~0.8-3 KB) that stays hot in L1 cache.
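
The factorization can be checked numerically in plain Python. This is only an illustrative sketch, not the Triton kernel: the helper names (build_tables, direct_value, max_relative_error), the metric tensor values, and the offsets are all hypothetical. It assumes G is symmetric, so that f^T G f splits into three diagonal terms G_kk * f_k² and three cross terms 2 * G_jk * f_j * f_k.

```python
import math

def build_tables(alpha, G, f):
    """Return 1D diagonal tables and 2D cross-term tables for one atom.

    f[k] is the list of fractional offsets sampled along axis k.
    """
    # One 1D Gaussian table per fractional axis.
    diag = [[math.exp(-alpha * G[k][k] * x * x) for x in f[k]] for k in range(3)]
    # One 2D table per axis pair; all ones when the cell is orthogonal.
    cross = {
        (j, k): [[math.exp(-2.0 * alpha * G[j][k] * x * y) for y in f[k]]
                 for x in f[j]]
        for j, k in ((0, 1), (0, 2), (1, 2))
    }
    return diag, cross

def direct_value(alpha, G, fv):
    """Evaluate exp(-alpha * f^T G f) without any factorization."""
    q = sum(fv[a] * G[a][b] * fv[b] for a in range(3) for b in range(3))
    return math.exp(-alpha * q)

def max_relative_error(alpha, G, f):
    """Compare the table product against the direct value on the full grid."""
    diag, cross = build_tables(alpha, G, f)
    worst = 0.0
    for i, x in enumerate(f[0]):
        for j, y in enumerate(f[1]):
            for k, z in enumerate(f[2]):
                table = (diag[0][i] * diag[1][j] * diag[2][k]
                         * cross[(0, 1)][i][j]
                         * cross[(0, 2)][i][k]
                         * cross[(1, 2)][j][k])
                exact = direct_value(alpha, G, (x, y, z))
                worst = max(worst, abs(table - exact) / exact)
    return worst

# Hypothetical symmetric metric tensor (in Å²) and fractional offsets.
G = [[900.0, -120.0, 60.0],
     [-120.0, 1600.0, 0.0],
     [60.0, 0.0, 2500.0]]
offsets = [-0.02, -0.01, 0.0, 0.01, 0.02]
err = max_relative_error(0.5, G, [offsets, offsets, offsets])
```

With n samples per axis, the tables hold 3n + 3n² values instead of n³, which is consistent with the small per-atom scratch buffer described above.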

Eliminates the real_space_grid tensor (~500 MB) and all PBC matrix operations that the fused kernel requires.

Provides forward and backward kernels with full autograd support for xyz, b, and occ.

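For non-orthogonal cells, the kernel evaluates a single combined exponent, exp(-alpha * r²), instead of multiplying separate diagonal and cross-term exp() factors. The cross-term factors can overflow on their own even when the combined value is small.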
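
A small plain-Python sketch of that failure mode follows. The cell (a = b = c = 50 Å, gamma = 150°), the value of alpha, and the helper names are hypothetical and deliberately exaggerated so that the effect shows up in float64. In float32, exp() already overflows once its argument exceeds roughly 88.7, so much milder inputs are enough. The sketch assumes r² = f^T G f.

```python
import math

def safe_exp(x):
    """math.exp that returns inf on overflow, as IEEE hardware would."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf

def quad_form(G, f):
    """r² = f^T G f for a fractional offset f and metric tensor G."""
    return sum(f[a] * G[a][b] * f[b] for a in range(3) for b in range(3))

def separate_product(alpha, G, f):
    """Multiply the diagonal and cross-term exponentials separately."""
    diag = 1.0
    for k in range(3):
        diag *= safe_exp(-alpha * G[k][k] * f[k] * f[k])
    cross = 1.0
    for j, k in ((0, 1), (0, 2), (1, 2)):
        cross *= safe_exp(-2.0 * alpha * G[j][k] * f[j] * f[k])
    return diag * cross

def combined(alpha, G, f):
    """Single exponential of the full quadratic form; the argument is <= 0."""
    return math.exp(-alpha * quad_form(G, f))

# Hypothetical strongly oblique cell: a = b = c = 50 Å, gamma = 150 degrees.
g01 = 50.0 * 50.0 * math.cos(math.radians(150.0))
G = [[2500.0, g01, 0.0],
     [g01, 2500.0, 0.0],
     [0.0, 0.0, 2500.0]]
f = (0.02, 0.02, 0.0)   # 1 Å along each of the first two axes
alpha = 500.0           # exaggerated sharpness, chosen to trigger the effect

bad = separate_product(alpha, G, f)   # diagonal underflows, cross overflows
good = combined(alpha, G, f)          # finite, strictly between 0 and 1
```

Because G is positive definite, r² is never negative, so the combined exponent is never positive and the single exp() stays within [0, 1]. The separate product here multiplies an underflowed 0.0 by an overflowed inf, which yields NaN.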

torchref.base.kernels.separable_triton_kernel.separable_density_gpu(density_map, xyz, b, inv_frac_matrix, frac_matrix, A, B, occ, radius_angstrom)

Separable Gaussian density splatting on GPU via Triton.

Eliminates the real_space_grid tensor and PBC matrix operations by working directly in fractional space with the metric tensor. Precomputes 1D Gaussian tables per atom, then gathers from those tables for each voxel inside the cutoff sphere.
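
The role of the metric tensor can be illustrated with a short plain-Python sketch. It assumes the column-vector convention x = M f for the fractional→Cartesian matrix, which may differ from the convention torchref uses for frac_matrix. The monoclinic cell and the helper names are hypothetical. Under that assumption G = M^T M, and the squared Cartesian distance is f^T G f, so no Cartesian coordinates are needed per voxel.

```python
import math

def metric_tensor(M):
    """G = M^T M for a fractional-to-Cartesian matrix M (x = M f)."""
    return [[sum(M[k][a] * M[k][b] for k in range(3)) for b in range(3)]
            for a in range(3)]

def r2_fractional(G, f):
    """Squared distance computed purely in fractional space."""
    return sum(f[a] * G[a][b] * f[b] for a in range(3) for b in range(3))

def r2_cartesian(M, f):
    """Squared distance after converting the offset to Cartesian space."""
    x = [sum(M[r][c] * f[c] for c in range(3)) for r in range(3)]
    return sum(v * v for v in x)

# Hypothetical monoclinic cell: a = 30, b = 40, c = 50 Å, beta = 110 degrees.
beta = math.radians(110.0)
M = [[30.0, 0.0, 50.0 * math.cos(beta)],
     [0.0, 40.0, 0.0],
     [0.0, 0.0, 50.0 * math.sin(beta)]]
G = metric_tensor(M)

f = (0.01, -0.02, 0.015)   # fractional offset of a voxel from the atom
diff = abs(r2_fractional(G, f) - r2_cartesian(M, f))
```

G depends only on the cell, so it can be computed once and reused for every atom and voxel.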

Parameters:
  • density_map: shape (nx, ny, nz); density grid to update (not modified in place)

  • xyz: shape (N_atoms, 3); Cartesian positions

  • b: shape (N_atoms,); isotropic B-factors

  • inv_frac_matrix: shape (3, 3); Cartesian→fractional

  • frac_matrix: shape (3, 3); fractional→Cartesian

  • A: shape (N_atoms, 5); ITC92 amplitudes

  • B: shape (N_atoms, 5); ITC92 widths

  • occ: shape (N_atoms,); occupancies

  • radius_angstrom: float; cutoff radius in Å

Return type:

torch.Tensor: the updated density map, returned as a new tensor (density_map itself is not modified)