Releases: JamePeng/llama-cpp-python

v0.3.16-cu126-Basic-win-20251119

19 Nov 19:19

v0.3.16-cu124-Basic-win-20251119

19 Nov 17:50

v0.3.16-cu126-Basic-win-20251112

v0.3.16-cu124-Basic-win-20251112

v0.3.16-cu126-Basic-win-20251109

09 Nov 07:40

feat: Update Submodule vendor/llama.cpp 48bd265..299f5d7
feat: Update the llama.cpp API bindings and supplement the State/Sessions API
feat: Better Qwen3VL chat template. (Thanks to @alcoftTAO)

Note: llama_chat_template now allows more flexible input of the parameters required by the model, and supports applying more complex Jinja formats.
The initial input parameters for Qwen3VLChatHandler have changed: "use_think_prompt" has been renamed to "force_reasoning".
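For callers upgrading across this rename, a minimal migration sketch, assuming the old boolean flag maps one-to-one onto the new one (the helper function and the example kwargs below are hypothetical, not part of the package):

```python
# Hypothetical helper for the Qwen3VLChatHandler parameter rename noted above:
# "use_think_prompt" -> "force_reasoning". Only the rename itself comes from
# the release notes; this function is an illustrative sketch.

def migrate_qwen3vl_kwargs(kwargs: dict) -> dict:
    """Return a copy of handler kwargs with the old flag renamed."""
    kwargs = dict(kwargs)  # copy, so the caller's dict is not mutated
    if "use_think_prompt" in kwargs:
        # Carry the old value over under the new parameter name.
        kwargs["force_reasoning"] = kwargs.pop("use_think_prompt")
    return kwargs

# Example: kwargs written for the old handler signature...
old_kwargs = {"clip_model_path": "mmproj.gguf", "use_think_prompt": True}
# ...rewritten for the new one before constructing Qwen3VLChatHandler.
new_kwargs = migrate_qwen3vl_kwargs(old_kwargs)
```

A wrapper like this lets existing call sites keep working while code is migrated to pass `force_reasoning` directly.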

v0.3.16-cu124-Basic-win-20251109

09 Nov 07:47

(Release notes identical to v0.3.16-cu126-Basic-win-20251109 above.)

v0.3.16-cu128-AVX2-win-20251108

08 Nov 08:36
3d96053

(Release notes identical to v0.3.16-cu126-Basic-win-20251109 above.)

v0.3.16-cu128-AVX2-linux-20251108

08 Nov 05:15
3d96053

(Release notes identical to v0.3.16-cu126-Basic-win-20251109 above.)

v0.3.16-cu126-AVX2-win-20251108

08 Nov 08:47
3d96053

(Release notes identical to v0.3.16-cu126-Basic-win-20251109 above.)

v0.3.16-cu126-AVX2-linux-20251108

08 Nov 04:55
3d96053

(Release notes identical to v0.3.16-cu126-Basic-win-20251109 above.)