Here is how the split between prefill and generation exposes the structural inefficiencies of GPUs in AI processor designs.