turboquant-py implements the TurboQuant and QJL vector quantization algorithms from Google Research (ICLR 2026 / AISTATS 2026). It compresses high-dimensional floating-point vectors to 1-4 bits per ...
* The basic programming model of Triton. * The `triton.jit` decorator, which is used to define Triton kernels. * The best practices for validating and benchmarking your custom ops against native ...