Port of Facebook's LLaMA model in C/C++
The main goal of llama.cpp is to run the LLaMA model using 4-bit
integer quantization on a MacBook.
* Plain C/C++ implementation without dependencies
* Apple silicon is a first-class citizen, optimized via the ARM NEON,
Accelerate and Metal frameworks
* AVX, AVX2 and AVX512 support for x86 architectures
* Mixed F16 / F32 precision
* 2-bit, 3-bit, 4-bit, 5-bit, 6-bit and 8-bit integer quantization support
(the 4-bit scheme is sketched after this list)
* CUDA, Metal and OpenCL GPU backend support
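The quantization listed above is block-based: weights are grouped into small blocks, each stored as one floating-point scale plus packed low-bit integers, trading a little accuracy for a fraction of the memory. The following is a minimal sketch of that idea; the struct layout, rounding, and scale choice here are simplified assumptions and do not match ggml's actual Q4_0 on-disk format (which uses an f16 scale and a different scale convention).

```c
/* build: cc q4_sketch.c -lm */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK 32 /* weights per block; ggml's Q4_0 also uses 32 */

/* One quantized block: a float scale plus 32 signed 4-bit values
 * packed two per byte. Illustrative layout, not ggml's format. */
typedef struct {
    float   scale;
    uint8_t q[BLOCK / 2];
} block_q4;

/* Quantize BLOCK floats to 4-bit ints in [-8, 7] with one shared scale. */
static void quantize_block(const float *x, block_q4 *out) {
    float amax = 0.0f;
    for (int i = 0; i < BLOCK; i++) {
        float a = fabsf(x[i]);
        if (a > amax) amax = a;
    }
    out->scale = amax / 7.0f;
    float inv = out->scale != 0.0f ? 1.0f / out->scale : 0.0f;
    for (int i = 0; i < BLOCK; i += 2) {
        int q0 = (int)lroundf(x[i] * inv);
        int q1 = (int)lroundf(x[i + 1] * inv);
        if (q0 < -8) q0 = -8;
        if (q0 > 7)  q0 = 7;
        if (q1 < -8) q1 = -8;
        if (q1 > 7)  q1 = 7;
        /* two 4-bit values per byte, biased by +8 to be unsigned */
        out->q[i / 2] = (uint8_t)((q0 + 8) | ((q1 + 8) << 4));
    }
}

/* Recover approximate floats: x ~= scale * (q - 8). */
static void dequantize_block(const block_q4 *in, float *x) {
    for (int i = 0; i < BLOCK; i += 2) {
        uint8_t b = in->q[i / 2];
        x[i]     = in->scale * (float)((b & 0x0F) - 8);
        x[i + 1] = in->scale * (float)((b >> 4) - 8);
    }
}

int main(void) {
    float x[BLOCK], y[BLOCK];
    for (int i = 0; i < BLOCK; i++) x[i] = sinf((float)i);
    block_q4 b;
    quantize_block(x, &b);
    dequantize_block(&b, y);
    for (int i = 0; i < 4; i++)
        printf("%8.4f -> %8.4f\n", x[i], y[i]);
    return 0;
}
```

With a 32-weight block, 128 bytes of F32 weights shrink to 20 bytes (16 bytes of packed nibbles plus a 4-byte scale), which is what lets multi-billion-parameter models fit in laptop RAM.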
The original implementation of llama.cpp was hacked together in an
evening. Since then, the project has improved significantly thanks to
many contributions. The project is mainly intended for educational
purposes and serves as the main playground for developing new features
for the ggml library.
- RPM: llama-cpp-b4094-10.fc42.x86_64.rpm
- Summary: Port of Facebook's LLaMA model in C/C++
- URL: https://github.com/ggerganov/llama.cpp
- Group: Unspecified
- License: MIT AND Apache-2.0 AND LicenseRef-Fedora-Public-Domain
- Source: llama-cpp-b4094-10.fc42.src.rpm
- Checksum (SHA-256): b5ab621d5da18c11046bf35b126d08c89052c320eb294e734b323c676111b586
- Signature: RSA/SHA512, Sun 05 Apr 2026 07:21:05 PM AEST, Key ID d760880122ab8392
- Build Date: 2025/01/29 23:21:12
- Requires:
- Provides:
  - libggml-base.so
  - libggml.so
  - libllama.so
  - llama-cpp = b4094-10.fc42
  - llama-cpp(x86-64) = b4094-10.fc42
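Since the package provides the libllama.so C API, a program can link against it directly. Below is a minimal smoke test assuming the llama.h header is available from a matching -devel subpackage (an assumption, not confirmed by this page); the function names follow the upstream API around the b4094 snapshot and have been renamed in later releases.

```c
/* build: gcc smoke.c -lllama -o smoke */
#include <stdio.h>
#include "llama.h"

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }

    /* initialize the ggml/llama backends once per process */
    llama_backend_init();

    struct llama_model_params mparams = llama_model_default_params();
    struct llama_model *model = llama_load_model_from_file(argv[1], mparams);
    if (!model) {
        fprintf(stderr, "failed to load %s\n", argv[1]);
        llama_backend_free();
        return 1;
    }

    /* print one model property to confirm the load worked */
    printf("vocab size: %d\n", llama_n_vocab(model));

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

Run it against any GGUF model file, for example one produced by the project's quantization tools.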