Releases: JamePeng/llama-cpp-python
v0.3.17-cu128-Basic-win-20251202
update: Update submodule vendor/llama.cpp 7f8ef50..746f9ee
feat: Sync llama.cpp API 20251202
refactor: Optimize LlamaGrammar class code
feat: Update llama_grammar.py from vendor/llama.cpp/examples/json-schema-to-grammar.py
feat: Enhance text-based bootstrapping for Vulkan compilation and improve recognition of Win32 Vulkan libraries
feat: Enhance Windows CUDA path detection logic
feat: Add a new workflow with Basic options for the Linux platform (CUDA 12.4, 12.6, 12.8)
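For context, the "Windows CUDA path detection" item concerns locating an installed CUDA toolkit. A minimal sketch of that kind of lookup (illustrative only, not the project's actual code) could scan the environment variables a Windows CUDA installer typically sets:

```python
import os

def find_cuda_candidates(environ=None):
    """Illustrative sketch: collect candidate CUDA toolkit roots from the
    environment variables a Windows CUDA installer typically sets
    (CUDA_PATH plus versioned CUDA_PATH_Vxx_y entries)."""
    if environ is None:
        environ = os.environ
    candidates = []
    # Check the unversioned variable first, then any versioned ones.
    for name in ["CUDA_PATH"] + sorted(
        n for n in environ if n.startswith("CUDA_PATH_V")
    ):
        value = environ.get(name)
        if value and value not in candidates:
            candidates.append(value)
    return candidates
```

The function name and exact precedence order here are assumptions; the real detection logic in the build scripts may differ.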
v0.3.17-cu128-Basic-linux-20251202
Same changes as v0.3.17-cu128-Basic-win-20251202 above.
v0.3.17-cu126-Basic-win-20251202
Same changes as v0.3.17-cu128-Basic-win-20251202 above.
v0.3.17-cu126-Basic-linux-20251202
Same changes as v0.3.17-cu128-Basic-win-20251202 above.
v0.3.17-cu124-Basic-win-20251202
Same changes as v0.3.17-cu128-Basic-win-20251202 above.
v0.3.17-cu124-Basic-linux-20251202
Same changes as v0.3.17-cu128-Basic-win-20251202 above.
v0.3.17-cu128-Basic-win-20251121
Bump version to 0.3.17
v0.3.17-cu126-Basic-win-20251121
Bump version to 0.3.17
v0.3.17-cu124-Basic-win-20251121
Bump version to 0.3.17
v0.3.16-cu128-Basic-win-20251119
feat: Update Llava15ChatHandler to accept use_gpu, image_min_tokens, and image_max_tokens
- The image_min_tokens parameter can now be passed in Qwen3VLChatHandler to support bbox grounding tasks
- Add validation to ensure image_max_tokens is not less than image_min_tokens
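The min/max validation mentioned above might look like the following minimal sketch (a hypothetical helper, not the handler's actual implementation; only the parameter names come from the release note):

```python
def check_image_token_bounds(image_min_tokens=None, image_max_tokens=None):
    """Hypothetical sketch of the validation described in the release note:
    image_max_tokens must not be smaller than image_min_tokens."""
    if (image_min_tokens is not None
            and image_max_tokens is not None
            and image_max_tokens < image_min_tokens):
        raise ValueError(
            f"image_max_tokens ({image_max_tokens}) must not be less than "
            f"image_min_tokens ({image_min_tokens})"
        )
```

Either bound may be omitted; the check only fires when both are supplied and inconsistent.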
feat: Update llama.cpp 20251115 and Move the ggml-related code to _ggml.py.
feat: Remove parameters that are no longer needed: mctx_params.verbosity
feat: Supplement the use of mtmd_helper_log_set to align with llama.cpp
feat: Add a Basic workflow for cu128 windows wheels
feat: Update Submodule vendor/llama.cpp cb623de..07b0e7a