Releases: JamePeng/llama-cpp-python
v0.3.18-cu130-AVX2-win-20251220
Bump version to 0.3.18
Signed-off-by: JamePeng <jame_peng@sina.com>
v0.3.18-cu130-AVX2-linux-20251220
Bump version to 0.3.18
Signed-off-by: JamePeng <jame_peng@sina.com>
v0.3.18-cu128-AVX2-win-20251220
Bump version to 0.3.18
Signed-off-by: JamePeng <jame_peng@sina.com>
v0.3.18-cu128-AVX2-linux-20251220
Bump version to 0.3.18
Changelog here: llama-cpp-python 0.3.18 Changelog
Signed-off-by: JamePeng <jame_peng@sina.com>
v0.3.18-cu126-AVX2-linux-20251220
Bump version to 0.3.18
Signed-off-by: JamePeng <jame_peng@sina.com>
v0.3.18-cu124-AVX2-linux-20251220
Bump version to 0.3.18
Changelog here: llama-cpp-python 0.3.18 Changelog
Signed-off-by: JamePeng <jame_peng@sina.com>
v0.3.17-cu130-AVX2-win-20251209
feat: perf: optimize LlamaModel.metadata reading performance
- Increase initial buffer size to 16KB to eliminate re-allocations for large chat templates.
- Cache ctypes function references to reduce loop overhead.
- Across repeated model loads this yields a cumulative speedup of roughly 1-3% (see the sketch below).
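A minimal sketch of the pattern described above, not the release's actual code: it assumes the llama.cpp metadata functions llama_model_meta_count, llama_model_meta_key_by_index, and llama_model_meta_val_str_by_index as exposed by the llama_cpp bindings; read_metadata and read_str are illustrative names.

```python
import ctypes

import llama_cpp  # the package's low-level ctypes bindings to llama.cpp

def read_metadata(model) -> dict:
    """Read all GGUF metadata key/value pairs from a loaded llama model."""
    buf_size = 16 * 1024  # 16 KiB up front: large chat templates fit without regrowing
    buf = ctypes.create_string_buffer(buf_size)

    # Cache the ctypes function references in locals once; repeated module
    # attribute lookups inside the loop are the overhead being removed.
    meta_count = llama_cpp.llama_model_meta_count
    key_by_index = llama_cpp.llama_model_meta_key_by_index
    val_by_index = llama_cpp.llama_model_meta_val_str_by_index

    def read_str(fn, i):
        # llama.cpp returns the string length, or a negative value on failure;
        # a length >= buf_size means the value was truncated, so regrow and retry.
        nonlocal buf, buf_size
        n = fn(model, i, buf, buf_size)
        if n < 0:
            return None
        if n >= buf_size:
            buf_size = n + 1
            buf = ctypes.create_string_buffer(buf_size)
            fn(model, i, buf, buf_size)
        return buf.value.decode("utf-8")

    metadata = {}
    for i in range(meta_count(model)):
        key = read_str(key_by_index, i)
        val = read_str(val_by_index, i)
        if key is not None and val is not None:
            metadata[key] = val
    return metadata
```

Hoisting the references matters because CPython resolves llama_cpp.llama_model_meta_* anew on every iteration; binding them to locals turns each call into a fast local load instead of a module attribute lookup.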
feat: Update submodule vendor/llama.cpp 2fa51c1..6b82eb7
feat: Sync ggml-zendnn: add ZenDNN backend for AMD CPUs
feat: workflow: Add CUDA 13.0.2 build workflows for Windows and Linux.
feat: Add the scan path for CUDA 13.0+ dynamic-link libraries on Windows ($env:CUDA_PATH\bin\x64); a sketch follows below.
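A minimal sketch of what such a scan-path addition looks like, using only the standard library (add_cuda13_dll_path is a hypothetical name, not the package's actual helper):

```python
import os

def add_cuda13_dll_path() -> None:
    # CUDA 13.0 ships its Windows runtime DLLs under bin\x64, so probe that
    # directory in addition to the classic bin location.
    cuda_path = os.environ.get("CUDA_PATH")  # set by the NVIDIA CUDA installer
    if os.name != "nt" or not cuda_path:
        return
    x64_bin = os.path.join(cuda_path, "bin", "x64")  # CUDA 13.0+ layout
    if os.path.isdir(x64_bin):
        # os.add_dll_directory (Python 3.8+, Windows-only) is the supported
        # way to extend the search path used when loading native libraries.
        os.add_dll_directory(x64_bin)
```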
v0.3.17-cu130-AVX2-linux-20251209
feat: perf: optimize LlamaModel.metadata reading performance
- Increase initial buffer size to 16KB to eliminate re-allocations for large chat templates.
- Cache ctypes function references to reduce loop overhead.
- Across repeated model loads this yields a cumulative speedup of roughly 1-3%.
feat: Update submodule vendor/llama.cpp 2fa51c1..6b82eb7
feat: Sync ggml-zendnn: add ZenDNN backend for AMD CPUs
feat: workflow: Add CUDA 13.0.2 build workflows for Windows and Linux.
feat: Add the scan path for CUDA 13.0+ dynamic-link libraries on Windows ($env:CUDA_PATH\bin\x64)
v0.3.17-cu128-AVX2-win-20251209
feat: perf: optimize LlamaModel.metadata reading performance
- Increase initial buffer size to 16KB to eliminate re-allocations for large chat templates.
- Cache ctypes function references to reduce loop overhead.
- Across repeated model loads this yields a cumulative speedup of roughly 1-3%.
feat: Update submodule vendor/llama.cpp 2fa51c1..6b82eb7
feat: Sync ggml-zendnn: add ZenDNN backend for AMD CPUs
feat: workflow: Add CUDA 13.0.2 build workflows for Windows and Linux.
feat: Add the scan path for CUDA 13.0+ dynamic-link libraries on Windows ($env:CUDA_PATH\bin\x64)
v0.3.17-cu128-AVX2-linux-20251209
feat: perf: optimize LlamaModel.metadata reading performance
- Increase initial buffer size to 16KB to eliminate re-allocations for large chat templates.
- Cache ctypes function references to reduce loop overhead.
- Across repeated model loads this yields a cumulative speedup of roughly 1-3%.
feat: Update submodule vendor/llama.cpp 2fa51c1..6b82eb7
feat: Sync ggml-zendnn: add ZenDNN backend for AMD CPUs
feat: workflow: Add CUDA 13.0.2 build workflows for Windows and Linux.
feat: Add the scan path for CUDA 13.0+ dynamic-link libraries on Windows ($env:CUDA_PATH\bin\x64)