vllm-project/vllm

Python · Apache License 2.0 · 85,411 stars

A high-throughput and memory-efficient inference and serving engine for LLMs

#amd #blackwell #cuda #deepseek #deepseek-v3 #gpt #gpt-oss #inference #kimi #llama #llm #llm-serving #model-serving #moe #openai #pytorch #qwen #qwen3 #tpu #transformerllm

Repository Info

Stars: 85,411
Forks: 18,964
Watchers: 85,411
Open Issues: 5,548
License: Apache License 2.0
Last Pushed: 5h ago