GraySoft
Projects Models Compare Cloud benchmarks FAQ Download guIDE →
Model Intelligence Sheet

ledgergap/Pollux-4B-Judge-GGUF overview

Pollux 4B Judge GGUF This repository contains GGUF versions of ai forever/Pollux 4B Judge https://huggingface.co/ai forever/Pollux 4B Judge for local inference…

ggufllama.cpplmstudioqwen3polluxrussianllm-as-a-judgequantizedtext-generationrubase_model:ai-forever/Pollux-4B-Judgebase_model:quantized:ai-forever/Pollux-4B-Judgelicense:mitendpoints_compatibleregion:usconversational

Runs locally from ~3.99 GB disk (4 GB VRAM class GPUs with llama.cpp / guIDE).

Downloads
0
Likes
0
Pipeline
text-generation
Author

Repository Files & Downloads

2 GGUF files detected
Direct downloads for local inference
FileTypeQuantizationSizeLink
Pollux-4B-Judge.BF16.ggufGGUFGGUF7.50 GBDownload
Pollux-4B-Judge.Q8_0.ggufGGUFGGUF3.99 GBDownload

Model Details

Model IDledgergap/Pollux-4B-Judge-GGUF
Authorledgergap
Pipelinetext-generation
Licensemit
Base modelai-forever/Pollux-4B-Judge
Last modified2026-06-08T21:01:43.000Z

Model README

---

license: mit

language:

  • ru

base_model: ai-forever/Pollux-4B-Judge

base_model_relation: quantized

library_name: gguf

pipeline_tag: text-generation

tags:

  • gguf
  • llama.cpp
  • lmstudio
  • qwen3
  • pollux
  • russian
  • llm-as-a-judge
  • quantized

quantized_by: ledgergap

---

Pollux-4B-Judge GGUF

This repository contains GGUF versions of ai-forever/Pollux-4B-Judge for local inference with llama.cpp, LM Studio, and other GGUF-compatible runtimes.

Pollux-4B-Judge is a Russian-oriented LLM-as-a-judge model based on Qwen3-4B. It is intended for evaluating model answers against a specific criterion and scoring rubric.

Files

| File | Type | Quantized | Notes |

|---|---:|---:|---|

| Pollux-4B-Judge.BF16.gguf | BF16 GGUF conversion | No | High-precision reference version |

| Pollux-4B-Judge.Q8_0.gguf | Q8_0 GGUF quantization | Yes | High-quality quantized version |

Which file should I use?

Use Pollux-4B-Judge.BF16.gguf if you want the highest-quality reference version.

Use Pollux-4B-Judge.Q8_0.gguf if you want a practical local version with lower memory usage and minimal expected quality loss.

Recommended inference settings

For judge-style usage, the original model card uses:

| Setting | Value |

|---|---:|

| Temperature | 0.0 |

| Max tokens | 512 |

For local GGUF inference, choose a context length large enough to fit the full evaluation prompt: instruction, reference answer, evaluated answer, criterion, and rubric. A practical starting point is 8192, but this is a local runtime recommendation rather than an official value from the original model card.

The model is intended to evaluate one criterion per request.

Prompt format

Recommended prompt structure:

### Задание для оценки:
{instruction}

### Эталонный ответ:
{reference_answer}

### Ответ для оценки:
{answer}

### Критерий оценки:
{criterion}

### Шкала оценивания по критерию:
{rubric}

Use with llama.cpp

BF16:

llama-server -hf ledgergap/Pollux-4B-Judge-GGUF:BF16 -c 8192 -ngl 99

Q8_0:

llama-server -hf ledgergap/Pollux-4B-Judge-GGUF:Q8_0 -c 8192 -ngl 99

Use with LM Studio

Open LM Studio and paste this repository URL into the model search/download field:

https://huggingface.co/ledgergap/Pollux-4B-Judge-GGUF

Then select either the BF16 or Q8_0 GGUF file.

Original model

Original model: ai-forever/Pollux-4B-Judge

Run ledgergap/Pollux-4B-Judge-GGUF with guIDE

Download guIDE — the AI-native code editor with local LLM inference and 69 built-in tools.

Download guIDE → · Browse 524k+ models · Compare models

Source: Hugging Face · Compare models