
SubSpec

Open-source LLM inference framework for tree-based speculative decoding research.

Published 2025-01-01 Updated 2026-01-11

SubSpec is an open-source LLM inference framework for research on tree-based speculative decoding (SD).

Highlights

Supports multiple offloading pipelines and target/draft model quantization strategies.
Includes an optimized tree-based SD pipeline that fully supports torch.compile for faster inference.
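
For readers new to tree-based SD, here is a minimal sketch of the verification step under greedy decoding. It is an illustration only, not SubSpec's implementation: DraftNode, verify_tree, and toy_target are hypothetical names, and the target model is a toy function. A real pipeline would typically score every node of the draft tree in one target forward pass rather than calling the target once per step as done here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DraftNode:
    """One drafted token plus the alternative continuations drafted after it."""
    token: int
    children: List["DraftNode"] = field(default_factory=list)


def verify_tree(
    prefix: List[int],
    roots: List[DraftNode],
    target_next: Callable[[List[int]], int],
) -> List[int]:
    """Greedy verification of a draft tree (hypothetical helper).

    Walks the tree from the roots. At each depth the target's greedy token
    is emitted; if a drafted candidate carries the same token, the walk
    descends into that branch, otherwise it stops. The last returned token
    is therefore always the target's own correction or bonus token.
    """
    accepted: List[int] = []
    candidates = roots
    while True:
        expected = target_next(prefix + accepted)
        accepted.append(expected)
        match = next((n for n in candidates if n.token == expected), None)
        if match is None:
            return accepted
        candidates = match.children


def toy_target(tokens: List[int]) -> int:
    """Stand-in for a target model: the next token is (last + 1) mod 10."""
    return (tokens[-1] + 1) % 10


# Draft tree with two root candidates (4 and 9); 4 branches into 5 -> 7 and 6.
draft_roots = [
    DraftNode(4, [DraftNode(5, [DraftNode(7)]), DraftNode(6)]),
    DraftNode(9),
]

new_tokens = verify_tree([1, 2, 3], draft_roots, toy_target)
print(new_tokens)
```

In this toy run the drafted path 4, 5 matches the target, and the drafted 7 is replaced by the target's own token 6, so one verification round yields three tokens. Sampling-based acceptance rules differ from this greedy comparison and are not shown.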

The code is available via the link above.