LLM decode memory wall: first bytes/token model
Separating prefill from decode and building a citable traffic ledger for weight reads, KV-cache reads, activation movement, and runtime overhead.
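As a rough illustration of what a bytes/token ledger for decode could look like, the sketch below adds up the four traffic categories named above for a single decoded token. It is a minimal sketch under stated assumptions, not the PCCX ledger itself: the names `DecodeShape`, `decode_ledger`, and `bandwidth_bound_tokens_per_s` are hypothetical, the model is assumed dense and decoded at batch size 1 so that every weight is read once per token, and activation movement and runtime overhead are passed in as plain byte counts because this page does not say how they are measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodeShape:
    """Hypothetical model shape; every field is an illustrative assumption."""
    n_params: int        # weights read once per decoded token (dense, batch 1)
    n_layers: int        # transformer layers holding a KV cache
    n_kv_heads: int      # key/value heads per layer
    head_dim: int        # elements per head
    weight_bytes: float  # bytes per stored weight (e.g. 0.5 for 4-bit)
    kv_bytes: float      # bytes per stored KV-cache element


def decode_ledger(shape, context_len, activation_bytes=0, overhead_bytes=0):
    """Return bytes moved per decoded token, split by ledger category."""
    # One cached position holds a key and a value vector in every layer.
    kv_per_position = (
        2 * shape.n_layers * shape.n_kv_heads * shape.head_dim * shape.kv_bytes
    )
    ledger = {
        "weight_reads": shape.n_params * shape.weight_bytes,
        # Decode attends over every cached position, so reads grow with context.
        "kv_cache_reads": kv_per_position * context_len,
        "activation_movement": activation_bytes,
        "runtime_overhead": overhead_bytes,
    }
    ledger["total"] = sum(ledger.values())
    return ledger


def bandwidth_bound_tokens_per_s(total_bytes_per_token, bandwidth_bytes_per_s):
    """Upper bound on decode rate if memory traffic were the only limit."""
    return bandwidth_bytes_per_s / total_bytes_per_token


# Illustrative numbers only; they do not describe any real model or board.
example_shape = DecodeShape(
    n_params=1_000_000_000, n_layers=16, n_kv_heads=8, head_dim=64,
    weight_bytes=1.0, kv_bytes=2.0,
)
example_ledger = decode_ledger(example_shape, context_len=1024)
```

Keeping the categories as separate ledger entries, rather than returning a single total, makes it visible which term dominates as the context length grows: weight reads stay constant per token while KV-cache reads scale linearly with the number of cached positions.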
Research notes, engineering evidence, and citable article pages for PCCX accelerator work.
Start here for memory-wall research.
Current implementation evidence gate.
Compatibility represented without execution.
This draft separates prefill from decode and builds a first PCCX traffic ledger for weight reads, KV-cache reads, activation movement, and runtime overhead.
This article tracks post-synthesis and post-implementation timing states without turning partial evidence into a timing-clean claim.
This article represents model, board, and runtime compatibility as deterministic data before any execution path exists.
This article separates public research artifacts from commercial IP boundaries for ProCore and ASICKit.
This article defines how public figures and claims map to commits, commands, logs, parsers, checksums, citations, and limitations.
This article maps public research artifacts to commercial evidence gates without unsupported production, bitstream, board, timing, or throughput claims.
Evaluation and partner inquiries use a separate evidence-gated path.