11 December 2024

[Launched] Generally Available: Open-source feature update: vLLM model serving in KAITO

KAITO now supports high-throughput model serving with the open-source vLLM serving engine. In the KAITO inference workspace, you can deploy models using vLLM to batch process incoming requests, accelerate inference, and optimize your AI workload by default.
Source: Microsoft Azure – updates
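
For context, KAITO inference deployments are declared as Kubernetes Workspace resources. Below is a minimal sketch, assuming the kaito.sh/v1alpha1 Workspace CRD layout and the kaito.sh/runtime annotation used by the KAITO project to select the serving engine; the workspace name, GPU instance type, and model preset are illustrative, not taken from the announcement.

```yaml
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b          # illustrative name
  annotations:
    kaito.sh/runtime: "vllm"         # select vLLM explicitly (assumed annotation; vLLM is now the default)
resource:
  instanceType: "Standard_NC24ads_A100_v4"  # assumed GPU SKU; size to the model
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b-instruct"       # assumed preset; see KAITO's supported model list
```

Applying a manifest like this with kubectl would have the operator provision GPU capacity and serve the preset model behind a cluster service; with vLLM as the runtime, incoming requests are batched continuously for higher throughput.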
