GraySoft
Model Comparison

prism-ml/bonsai-8b-ggufvsunsloth/qwen3-14b-gguf

Side-by-side comparison of prism-ml/bonsai-8b-gguf and unsloth/qwen3-14b-gguf: downloads, license, context length, tasks, and benchmarks.

prism-ml/bonsai-8b-gguf

prism-ml · text-generation

End-to-end 1-bit language model for llama.cpp (CUDA, Metal, CPU) > **14.1x** smaller than FP16 | **6.2x** faster on RTX 4090 | **4-5x** lower energy/token

unsloth/qwen3-14b-gguf

unsloth · text-generation

If you are using llama.cpp, Ollama, Open WebUI etc., you can add /think and /no_think to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations. Here is an example of mu…

Side-by-side Specifications

prism-ml/bonsai-8b-ggufunsloth/qwen3-14b-gguf
Authorprism-mlunsloth
Pipeline Tasktext-generationtext-generation
Libraryllama.cpptransformers
Downloads83,30939,267
Likes618123
LicenseUnknownUnknown
Context Length
Created2026-03-182025-04-28
Last Modified2026-04-162025-06-08
Tags
llama.cppgguf1-bitllama-cppcudametalon-deviceprismmlbonsaitext-generation
transformersggufqwen3text-generationqwenunslothenarxiv:2309.00071base_model:Qwen/Qwen3-14Bbase_model:quantized:Qwen/Qwen3-14B

View full details: prism-ml/bonsai-8b-gguf · unsloth/qwen3-14b-gguf