Pull requests: vllm-project/vllm
[MISC] change NIXL compatibility hash logging level to debug
Labels: kv-connector
#30182 opened Dec 6, 2025 by AuruTus

[Model] Move multimodal_cpu_fields definition to field config
Labels: multi-modality, qwen, ready, tpu, v1
#30181 opened Dec 6, 2025 by DarkLight1337

[Misc] Fix circular import in vllm.transformers_utils.config
Labels: fb-exported, meta-exported, ready
#30179 opened Dec 6, 2025 by yeqcharlotte

[ROCm][MXFP4] Enable FP4 MLA BMM support
Labels: rocm, v1
#30177 opened Dec 6, 2025 by dllehr-amd

[Misc][Core] Remove unused req_index increment in scheduler
Labels: v1
#30176 opened Dec 6, 2025 by ivanium

[Frontend] Add --uvicorn-access-log-exclude-paths option
Labels: frontend
#30175 opened Dec 6, 2025 by GeoffreyWang1117

[Bugfix] Improve DCP error message with backend hint
Labels: v1
#30174 opened Dec 6, 2025 by GeoffreyWang1117

[BugFix] Fix assert batch_descriptor.num_tokens == num_tokens_padded
Labels: nvidia, ready, v1
#30173 opened Dec 6, 2025 by LucasWilkinson

[Core][Hybrid allocator + connector] Support hybrid allocator + kv cache connector
Labels: kv-connector, tpu, v1
#30166 opened Dec 6, 2025 by ivanium

Nvidia ModelOpt workaround for issue 28072
Labels: nvidia, quantization
#30164 opened Dec 6, 2025 by shengliangxu

[Deepseek] Fix OOM during DeepSeek R1 startup
Labels: deepseek, v1
#30162 opened Dec 5, 2025 by MatthewBonanni

[Perf] Optimize group_topk kernel, 1.9% throughput improvement, 2.1% TPOT improvement
Labels: ready
#30159 opened Dec 5, 2025 by yewentao256

feat: add TxtSlicesDataset to allow sampling slices from txt file for benchmarking
Labels: performance
#30156 opened Dec 5, 2025 by hypdeb

update torchao safetensors impl
Labels: ready
#30155 opened Dec 5, 2025 by liangel-02

Integration for Ray LLM with load_format=runai_streamer
#30154 opened Dec 5, 2025 by jiangwu300

Bump nvshmem to 3.3.24 and fix CUDA 13 installation
Labels: nvidia
#30149 opened Dec 5, 2025 by dmitry-tokarev-nv

[Renderer] Separate out RendererConfig from ModelConfig
Labels: deepseek, documentation, frontend, kv-connector, llama, multi-modality, qwen, ready, ready-run-all-tests, speculative-decoding, structured-output, tpu, v1
#30145 opened Dec 5, 2025 by DarkLight1337

[Misc] Remove pad_for_cudagraphs from config
Labels: needs-rebase, nvidia, speculative-decoding, v1
#30143 opened Dec 5, 2025 by LucasWilkinson

Add llmcompressor fp8 kv-cache quant (per-tensor and per-attn_head)
Labels: documentation, llama, needs-rebase, speculative-decoding, v1
#30141 opened Dec 5, 2025 by eldarkurtic

[OpenAI] Add parameter metadata to validation errors
Labels: frontend
#30134 opened Dec 5, 2025 by R3hankhan123