Question 1

What is distil-labs/distil-qwen3-1.7b-customer-support-deferral-gguf?

Accepted Answer

--- license: apache-2.0 base_model: distil-labs/distil-qwen3-1.7b-customer-support-deferral tags: - tool-calling - function-calling - customer-support - airline - model-cascade - deferral - distil-labs - gguf - llama-cpp language: - en pipeline_tag: text-generation library_name: llama.cpp --- # Distil-Qwen3-1.7B-Customer-Support-Deferral — GGUF GGUF build of [distil-labs/distil-qwen3-1.7b-customer-support-deferral](https://huggingface.co/distil-labs/distil-qwen3-1.7b-customer-support-deferral), for serving with [llama.cpp](https://github.com/ggerganov/llama.cpp). A fine-tuned Qwen3-1.7B model for multi-turn **airline customer support** that runs as the small tier of a **two-model cascade**: it handles most support turns itself and **defers genuinely-hard turns to a larger model** by emitting a `defer_to_larger_model` tool call. Every assistant action is a single tool call — including ta…

Question 2

What license applies to distil-labs/distil-qwen3-1.7b-customer-support-deferral-gguf?

Accepted Answer

License: apache-2.0. Verify terms on Hugging Face before commercial use.

Question 3

How do I run distil-labs/distil-qwen3-1.7b-customer-support-deferral-gguf locally?

Accepted Answer

Download a GGUF file from this page and load it in guIDE or llama.cpp. Pipeline task: text-generation.

Question 4

How much VRAM or disk space does distil-labs/distil-qwen3-1.7b-customer-support-deferral-gguf need?

Accepted Answer

Runs locally from ~1.03 GB disk (4 GB VRAM class GPUs with llama.cpp / guIDE).

distil-labs/distil-qwen3-1.7b-customer-support-deferral-gguf overview

Repository Files & Downloads

Model Details

Model README

Distil-Qwen3-1.7B-Customer-Support-Deferral — GGUF

Results

Usage (llama.cpp)

Demo App

Quantizations

Links

License

Run distil-labs/distil-qwen3-1.7b-customer-support-deferral-gguf with guIDE

Model ID	distil-labs/distil-qwen3-1.7b-customer-support-deferral-gguf
Author	distil-labs
Pipeline	text-generation
License	apache-2.0
Base model	distil-labs/distil-qwen3-1.7b-customer-support-deferral
Last modified	2026-06-08T00:45:37.000Z