
AI / LLMs — privacy-respecting options » Local Runtime

llama.cpp — grade A

● up · checked 35m ago

C++ runtime for running LLMs locally on CPU and GPU. The backbone of most privacy-focused LLM stacks.

Visit llama.cpp →

At a glance

  • No signup or KYC required
  • Open-source codebase
  • Non-custodial — you hold keys
  • Self-hostable

Review

The reference inference engine for running Llama / Mistral / Qwen / DeepSeek family models on local hardware. Quantised GGUF formats let a 30B model run on a consumer GPU. Most privacy-focused LLM products (Ollama, LM Studio, Jan) wrap it. Apple Silicon (Metal), NVIDIA CUDA, and AMD ROCm are all supported.
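
To make that concrete, here is a minimal sketch of the quantised-GGUF workflow using the community llama-cpp-python bindings (one of many wrappers over this runtime). The model filename, context size, and prompt are illustrative assumptions, not values taken from this listing.

# Minimal sketch: load a quantised GGUF model and generate text locally
# via the llama-cpp-python bindings (pip install llama-cpp-python).
# The model path is a placeholder for any GGUF file you have downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/example-7b-instruct-q4_k_m.gguf",  # hypothetical file
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to the GPU; 0 keeps inference on CPU
)

out = llm("Explain GGUF quantisation in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])

The same GGUF file loads unchanged in llama.cpp's own CLI and in the wrappers named above; only the front end differs.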

Fees

Free · MIT · C++ · CPU/CUDA/ROCm/Metal


Audit trail — receipts for the editorial claim

  • Upstream up · HTTP 200 · 944ms · checked 35m ago
  • No .onion mirror listed
  • Last manual verification 2026-05-13 (<7d)
  • See curator log for llama.cpp

Embed this listing — for the provider

If you run llama.cpp, drop this on your site. The badge auto-reflects the latest grade and pick status, and links back to this page.

Listed on xmr.club — llama.cpp
<a href="https://www.xmr.club/ai/llama-cpp">
  <img src="https://www.xmr.club/badge/llama-cpp.svg" alt="Listed on xmr.club — llama.cpp" height="32" />
</a>

Reviews — moderated · rules

No approved reviews yet. Be the first.

Add a review

Honest, brand-neutral feedback welcome. A curator approves before it appears here. No JS, no signup required.

Required: review body. Honest, descriptive reviews get approved within a day. Marketing copy, slurs, or invective get rejected. Submissions are capped at 5 per IP per day.
