Question 1

What is Dzluck/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF?

Accepted Answer

Dzluck/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF is a GGUF-format conversion of [`AiCloser/Qwen2.5-0.5B-Instruct-Thinking`](https://huggingface.co/AiCloser/Qwen2.5-0.5B-Instruct-Thinking), produced with llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space. The bundled README is headed Karsh-CAI/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF, a different namespace from this page's model ID. Refer to the [original model card](https://huggingface.co/AiCloser/Qwen2.5-0.5B-Instruct-Thinking) for more details on the model.

README metadata:

- license: apache-2.0
- datasets: Congliu/Chinese-DeepSeek-R1-Distill-data-110k
- language: zh
- base_model: AiCloser/Qwen2.5-0.5B-Instruct-Thinking
- pipeline_tag: text-generation
- library_name: transformers
- tags: llama-cpp, gguf-my-repo

To use the model with llama.cpp, install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```

Then invoke the llama.cpp server or the CLI. The README's CLI example is cut off in this excerpt: `llama-cli --hf-repo Karsh-CAI/Qwen2.5-0.5…`

Question 2

What license applies to Dzluck/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF?

Accepted Answer

License: apache-2.0. Verify terms on Hugging Face before commercial use.

Question 3

How do I run Dzluck/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF locally?

Accepted Answer

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.
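
A minimal sketch of the llama.cpp route, assuming the GGUF file has already been downloaded. The path `./model.gguf`, the prompt, and the token limit are placeholders (not taken from this page), and the helper names `show_cmd` and `run_model` are hypothetical. The script prints the command it would run, and only executes it when both `llama-cli` and the file are present.

```bash
# Assumptions (not from the model page): the GGUF file was saved as ./model.gguf
# and llama-cli from llama.cpp is on PATH. Adjust both to your setup.
MODEL_PATH="${MODEL_PATH:-./model.gguf}"
PROMPT="${PROMPT:-Hello}"
N_PREDICT="${N_PREDICT:-128}"

# Print the command that would be run, so it can be inspected first.
show_cmd() {
  printf 'llama-cli -m %s -p "%s" -n %s\n' "$MODEL_PATH" "$PROMPT" "$N_PREDICT"
}

# Run the model only when both the binary and the model file are present;
# otherwise fall back to printing the command.
run_model() {
  if command -v llama-cli >/dev/null 2>&1 && [ -f "$MODEL_PATH" ]; then
    llama-cli -m "$MODEL_PATH" -p "$PROMPT" -n "$N_PREDICT"
  else
    echo "llama-cli or $MODEL_PATH not found; would run:" >&2
    show_cmd
  fi
}

run_model
```

Set `MODEL_PATH` to the actual filename listed in the repository before running; `-m`, `-p`, and `-n` select the model file, the prompt, and the number of tokens to generate.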

Question 4

How much VRAM or disk space does Dzluck/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF need?

Accepted Answer

The model file takes about 506.5 MB of disk space. It is listed for 4 GB VRAM class GPUs when run locally with llama.cpp or guIDE.
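
Before downloading, it can help to confirm the target directory has room for the file. The sketch below rounds the ~506.5 MB figure up to 507 MB. The helper names `available_mb` and `fits` and the `TARGET_DIR` variable are hypothetical, and the check covers disk space only, not VRAM.

```bash
REQUIRED_MB=507   # rounded up from the ~506.5 MB figure quoted above

# Available space, in MB, on the filesystem holding directory $1.
# Uses POSIX df output (-P) in 1024-byte blocks (-k).
available_mb() {
  df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / 1024 }'
}

# Succeeds when $1 (available MB) is at least $2 (required MB).
fits() {
  [ "$1" -ge "$2" ]
}

TARGET_DIR="${TARGET_DIR:-.}"   # assumption: download into the current directory
avail=$(available_mb "$TARGET_DIR")

if fits "$avail" "$REQUIRED_MB"; then
  echo "ok: ${avail} MB available, ${REQUIRED_MB} MB needed"
else
  echo "insufficient: ${avail} MB available, ${REQUIRED_MB} MB needed"
fi
```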

Dzluck/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF overview

Model ID	Dzluck/Qwen2.5-0.5B-Instruct-Thinking-Q8_0-GGUF
Author	Dzluck
Pipeline	text-generation
License	apache-2.0
Base model	AiCloser/Qwen2.5-0.5B-Instruct-Thinking
Last modified	2026-06-19T11:57:27.000Z