Releases: JamePeng/llama-cpp-python
v0.3.16-cu126-Basic-win-20251119
feat: Update Llava15ChatHandler to accept use_gpu, image_min_tokens, and image_max_tokens.
- The image_min_tokens parameter can now be passed to Qwen3VLChatHandler to support bbox grounding tasks (see the sketch below these notes).
- Add validation to ensure image_max_tokens is not less than image_min_tokens.
feat: Update llama.cpp (20251115) and move the ggml-related code to _ggml.py.
feat: Remove parameters that are no longer needed: mctx_params.verbosity
feat: Add mtmd_helper_log_set support to align with llama.cpp
feat: Add a Basic workflow for cu128 Windows wheels
feat: Update Submodule vendor/llama.cpp cb623de..07b0e7a
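A minimal sketch of the intended usage of the new image-token bounds with Qwen3VLChatHandler, assuming a local Qwen3-VL GGUF model and its mmproj file; the file paths are illustrative, and the clip_model_path argument name is assumed from the existing Llava15ChatHandler convention:

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen3VLChatHandler

# Raising image_min_tokens keeps small images from being encoded with too
# few visual tokens for bbox grounding; per these notes, the handler
# validates that image_max_tokens is not less than image_min_tokens.
handler = Qwen3VLChatHandler(
    clip_model_path="mmproj-qwen3-vl.gguf",  # hypothetical path
    use_gpu=True,
    image_min_tokens=1024,
    image_max_tokens=4096,
)

llm = Llama(
    model_path="qwen3-vl-8b.gguf",  # hypothetical path
    chat_handler=handler,
    n_ctx=8192,
)

response = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "file:///path/to/image.png"}},
            {"type": "text", "text": "Locate the cat and return its bounding box."},
        ],
    }],
)
print(response["choices"][0]["message"]["content"])
```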
v0.3.16-cu124-Basic-win-20251119
feat: Update Llava15ChatHandler to accept use_gpu, image_min_tokens, and image_max_tokens.
- The image_min_tokens parameter can now be passed to Qwen3VLChatHandler to support bbox grounding tasks.
- Add validation to ensure image_max_tokens is not less than image_min_tokens.
feat: Update llama.cpp (20251115) and move the ggml-related code to _ggml.py.
feat: Remove parameters that are no longer needed: mctx_params.verbosity
feat: Add mtmd_helper_log_set support to align with llama.cpp
feat: Add a Basic workflow for cu128 Windows wheels
feat: Update Submodule vendor/llama.cpp cb623de..07b0e7a
v0.3.16-cu126-Basic-win-20251112
feat: Update the LlamaContext API and release the model pointer when context creation from the model fails
feat: Refine the AVX instruction compilation workflow
feat: Use httplib to download models from a URL when libcurl is disabled
feat: Update Submodule vendor/llama.cpp 7d019cf..017ecee
patch: Fix CMake error at vendor/llama.cpp/tools/mtmd/CMakeLists.txt:16 (set_target_properties)
v0.3.16-cu124-Basic-win-20251112
feat: Update the LlamaContext API and release the model pointer when context creation from the model fails
feat: Refine the AVX instruction compilation workflow
feat: Use httplib to download models from a URL when libcurl is disabled
feat: Update Submodule vendor/llama.cpp 7d019cf..017ecee
patch: Fix CMake error at vendor/llama.cpp/tools/mtmd/CMakeLists.txt:16 (set_target_properties)
v0.3.16-cu126-Basic-win-20251109
feat: Update Submodule vendor/llama.cpp 48bd265..299f5d7
feat: Update the llama.cpp API and supplement the State/sessions API
feat: Better Qwen3VL chat template. (Thanks to @alcoftTAO)
Note: llama_chat_template now allows more flexible input of the parameters a model requires and supports applying more complex Jinja formats.
The constructor parameters for Qwen3VLChatHandler have changed: "use_think_prompt" has been renamed to "force_reasoning" (see the sketch below these notes).
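A minimal migration and session sketch under the same caveats as above (paths hypothetical, clip_model_path assumed from the Llava15ChatHandler convention): the old use_think_prompt flag becomes force_reasoning, and the supplemented State/sessions bindings underpin the existing high-level save_state/load_state helpers:

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen3VLChatHandler

# Before this release: Qwen3VLChatHandler(..., use_think_prompt=True)
# After this release:  the flag is named force_reasoning.
handler = Qwen3VLChatHandler(
    clip_model_path="mmproj-qwen3-vl.gguf",  # hypothetical path
    force_reasoning=True,
)

llm = Llama(
    model_path="qwen3-vl-8b.gguf",  # hypothetical path
    chat_handler=handler,
    n_ctx=8192,
)

# The high-level session helpers sit on top of the State/sessions API:
# snapshot the context (KV cache and sampling state), then restore it
# later to resume a conversation without re-evaluating the prompt.
state = llm.save_state()
llm.create_chat_completion(
    messages=[{"role": "user", "content": "Describe the scene."}],
)
llm.load_state(state)  # roll back to the snapshot
```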
v0.3.16-cu124-Basic-win-20251109
feat: Update Submodule vendor/llama.cpp 48bd265..299f5d7
feat: Update the llama.cpp API and supplement the State/sessions API
feat: Better Qwen3VL chat template. (Thanks to @alcoftTAO)
Note: llama_chat_template now allows more flexible input of the parameters a model requires and supports applying more complex Jinja formats.
The constructor parameters for Qwen3VLChatHandler have changed: "use_think_prompt" has been renamed to "force_reasoning".
v0.3.16-cu128-AVX2-win-20251108
feat: Update Submodule vendor/llama.cpp 48bd265..299f5d7
feat: Update the llama.cpp API and supplement the State/sessions API
feat: Better Qwen3VL chat template. (Thanks to @alcoftTAO)
Note: llama_chat_template now allows more flexible input of the parameters a model requires and supports applying more complex Jinja formats.
The constructor parameters for Qwen3VLChatHandler have changed: "use_think_prompt" has been renamed to "force_reasoning".
v0.3.16-cu128-AVX2-linux-20251108
feat: Update Submodule vendor/llama.cpp 48bd265..299f5d7
feat: Update the llama.cpp API and supplement the State/sessions API
feat: Better Qwen3VL chat template. (Thanks to @alcoftTAO)
Note: llama_chat_template now allows more flexible input of the parameters a model requires and supports applying more complex Jinja formats.
The constructor parameters for Qwen3VLChatHandler have changed: "use_think_prompt" has been renamed to "force_reasoning".
v0.3.16-cu126-AVX2-win-20251108
feat: Update Submodule vendor/llama.cpp 48bd265..299f5d7
feat: Update the llama.cpp API and supplement the State/sessions API
feat: Better Qwen3VL chat template. (Thanks to @alcoftTAO)
Note: llama_chat_template now allows more flexible input of the parameters a model requires and supports applying more complex Jinja formats.
The constructor parameters for Qwen3VLChatHandler have changed: "use_think_prompt" has been renamed to "force_reasoning".
v0.3.16-cu126-AVX2-linux-20251108
feat: Update Submodule vendor/llama.cpp 48bd265..299f5d7
feat: Update the llama.cpp API and supplement the State/sessions API
feat: Better Qwen3VL chat template. (Thanks to @alcoftTAO)
Note: llama_chat_template now allows more flexible input of the parameters a model requires and supports applying more complex Jinja formats.
The constructor parameters for Qwen3VLChatHandler have changed: "use_think_prompt" has been renamed to "force_reasoning".