NobodyWho/LFM2.5-8B-A1B-GGUF overview
NobodyWho/LFM2.5 8B A1B GGUF Overview GGUF quantization of LiquidAI's LFM2.5 8B A1B model, prepared for NobodyWho https://github.com/nobodywho ooo/nobodywho : …
Runs locally from ~4.80 GB disk (8 GB VRAM class GPUs with llama.cpp / guIDE).
Repository Files & Downloads
Model Details
| Model ID | NobodyWho/LFM2.5-8B-A1B-GGUF |
|---|---|
| Author | NobodyWho |
| Pipeline | text-generation |
| License | other |
| Base model | LiquidAI/LFM2.5-8B-A1B |
| Last modified | 2026-06-16T05:19:06.000Z |
Model README
---
license: other
license_name: lfm-open-license-v1.0
license_link: https://www.liquid.ai/lfm-open-license
base_model: LiquidAI/LFM2.5-8B-A1B
tags:
- gguf
- nobodywho
- tool-calling
- lfm2
- moe
pipeline_tag: text-generation
library_name: gguf
---
NobodyWho/LFM2.5-8B-A1B-GGUF
Overview
GGUF quantization of LiquidAI's LFM2.5-8B-A1B model, prepared for
NobodyWho: it works with NobodyWho
out of the box, with LiquidAI's recommended sampling metadata embedded in every quant.
LFM2.5-8B-A1B is a sparse Mixture-of-Experts model (8B total / ~1B active per token) built on the
hybrid LFM2 architecture — the fastest model in its size class on both CPU and GPU.
Note: tool calling is unreliable on this model — see the Tool calling note below.
Model Capabilities
- Text generation — instruction-following chat
- Tool calling — supported but unreliable: the model often answers in prose instead of
invoking a tool. NobodyWho suite: 8/14 (F16, Q8_0), 4/14 (Q4_K_M)
- Long context — 128k tokens
- Efficient MoE — 8B total / ~1B active per token
NobodyWho preparation
The upstream GGUF (built from LiquidAI commit feb5e04) already renders tool calls correctly in
the model's native markup — <|tool_call_start|>[get_weather(city="Paris")]<|tool_call_end|> —
so nothing needs patching; NobodyWho just verifies it with the test suite. The
-vendor-sampling quants additionally embed LiquidAI's recommended sampling settings as
general.sampling.* metadata, which NobodyWho reads and applies by default (see
core/src/sampler.rs).
Available Quantizations
| File | Approach | Tool-calling tests |
|------|----------|--------------------|
| LFM2.5-8B-A1B-F16-vendor-sampling.gguf | Vendor sampling injected | 8/14 |
| LFM2.5-8B-A1B-Q8_0-vendor-sampling.gguf | Vendor sampling injected | 8/14 |
| LFM2.5-8B-A1B-Q4_K_M-vendor-sampling.gguf | Vendor sampling injected | 4/14 |
> Tool-calling results from NobodyWho's suite (June 2026). Failures are the model **declining to
> emit a tool call** on complex parameter schemas (sets / tuples / nested lists / dicts), not a
> format error. Vendor sampling does not change the result (verified with and without).
> The -vendor-sampling suffix marks files that embed general.sampling.* metadata.
Quick Start
Using the NobodyWho library:
from nobodywho import Chat
chat = Chat("huggingface:NobodyWho/LFM2.5-8B-A1B-GGUF/LFM2.5-8B-A1B-Q8_0-vendor-sampling.gguf")
response = chat.ask("What is the capital of Denmark?").completed()
print(response) # The capital of Denmark is Copenhagen.
llama-cpp-python
from llama_cpp import Llama
llm = Llama.from_pretrained(
repo_id="NobodyWho/LFM2.5-8B-A1B-GGUF",
filename="LFM2.5-8B-A1B-Q8_0-vendor-sampling.gguf",
)
Model Specifications
- Parameters: 8B total / ~1B active (MoE)
- Context length: 128,000 tokens
- License: LFM Open License v1.0
- Base model: LiquidAI/LFM2.5-8B-A1B
- Architecture: lfm2moe
Licensing / Credits
Licensed under LFM Open License v1.0 (unchanged from upstream). All model credit belongs to
Liquid AI.
Run NobodyWho/LFM2.5-8B-A1B-GGUF with guIDE
Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.
Source: Hugging Face · Compare models