onednn_w8a16_fp8(x, qweight, scales[, bias])    W8A16 GEMM: fp16/bf16 activations × FP8_E4M3 weights, per-column scale
onednn_w4a16(x, weight, scales, zeros[, bias])  W4A16 GEMM: fp16/bf16 activations × ...
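
To make the W8A16 entry concrete, here is a minimal pure-Python sketch of what an FP8_E4M3 weight-only GEMM with a per-column scale computes. It is a reference for the arithmetic only, not the oneDNN kernel. The names `decode_e4m3` and `w8a16_reference` are hypothetical, and several details are assumptions not stated above: `qweight` is laid out as K × N raw bytes, the encoding is the common E4M3 "fn" variant (exponent bias 7, no infinities), and `scales` holds one value per output column.

```python
def decode_e4m3(byte):
    """Decode one FP8 E4M3 byte (assumed 'fn' variant, bias 7) to a float."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF   # 4 exponent bits
    man = byte & 0x7          # 3 mantissa bits
    if exp == 0xF and man == 0x7:
        return float("nan")   # the only NaN pattern; there are no infinities
    if exp == 0:
        # Subnormal: no implicit leading one, fixed exponent of -6.
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def w8a16_reference(x, qweight, scales, bias=None):
    """Reference W8A16 GEMM (hypothetical helper, not the library function).

    x:       M x K activations (plain floats stand in for fp16/bf16)
    qweight: K x N FP8 E4M3 bytes (layout is an assumption)
    scales:  N per-column scales
    bias:    optional N-element bias
    """
    m, k, n = len(x), len(qweight), len(scales)
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += x[i][p] * decode_e4m3(qweight[p][j])
            # Scaling the accumulated sum is equivalent to scaling every
            # weight in column j before the multiply.
            acc *= scales[j]
            if bias is not None:
                acc += bias[j]
            row.append(acc)
        out.append(row)
    return out


# Usage: bytes 0x38, 0x40, 0xB8, 0x3C decode to 1.0, 2.0, -1.0, 1.5.
x = [[1.0, 2.0]]
qweight = [[0x38, 0x40],
           [0xB8, 0x3C]]
scales = [0.5, 2.0]
y = w8a16_reference(x, qweight, scales)
y_bias = w8a16_reference(x, qweight, scales, bias=[1.0, 1.0])
```

A real kernel would keep the weights in FP8 in memory and dequantize on the fly; the triple loop here only exists to spell out the arithmetic.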
pybind11 - You need to add the pybind11 repo as a submodule to your project (or install it into the OS)
vxl
To avoid messing with your host machine, you can build ...