WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching

Published in Proceedings of the ACM on Measurement and Analysis of Computing Systems (ACM POMACS-SIGMETRICS), 2026

Recommended citation: Li, X., Fan, J., Wang, Q., Spatharakis, D., Ghafouri, S., Vandierendonck, H., John, D., Ji, B., Butt, A. R., & Nikolopoulos, D. S. (2026). "WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching." Proceedings of the ACM on Measurement and Analysis of Computing Systems (ACM POMACS-SIGMETRICS).