Zewen Shen
Member of Technical Staff, Foundations
Zewen Shen is a Member of Technical Staff on the Foundations team at Cohere, where he works on LLM inference acceleration. He holds a Ph.D. from the University of Toronto, and his background in computational mathematics informs his work on model quantization and mixed-precision computation.
Multiple Authors - Apr 22, 2026
Production-Ready W4A8: vLLM Integration and Quality Recovery Techniques Explained