Parallel CPU-GPU Execution for LLM Inference on Constrained GPUs

Published in arXiv preprint, 2025

Recommended citation: Fan, J., Zhang, Y., Li, X., & Nikolopoulos, D. S. (2025). "Parallel CPU-GPU Execution for LLM Inference on Constrained GPUs." arXiv preprint arXiv:2506.03296.