duyntnet/faro-yi-9b-dpo-imatrix-gguf IQ3_M GGUF - Free GGUF Download is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

duyntnet/faro-yi-9b-dpo-imatrix-gguf overview

This is the DPO version of wenbopan/Faro-Yi-9B. Compared to Faro-Yi-9B and Yi-9B-200K, the DPO model excels at many tasks, surpassing the original Yi-9B-200K by a large margin. On the Open LLM Leaderboard, it ranks #2 among all 9B models, #1 among all Yi-9B variants. | Metric | MMLU | GSM8K | hellaswag | truthfulqa | ai2_arc | winogrande | CMMLU | | ----------------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- | --------- | | Yi-9B-200K | 65.73 | 50.49 | 56.72 | 33.80 | 69.25 | 71.67 | 71.97 | | Faro-Yi-9B | 68.80 | 63.08 | 57.28 | 40.86 | 72.58 | 71.11 | 73.28 | | Faro-Yi-9B-DPO | 69.98 | 66.11 | 59.04 | 48.01 | 75.68 | 73.40 | 75.23 | Faro-Yi-9B-DPO's responses are also favored by GPT-4 Judge in MT-Bench !image/png

transformersggufimatrixFaro-Yi-9B-DPOtext-generationenarxiv:2303.08774license:otherregion:usconversational

duyntnet/faro-yi-9b-dpo-imatrix-gguf visual

Downloads

245

Likes

Pipeline

text-generation

Library

transformers

Visibility

Public

Access

Open

Repository Files & Downloads

27 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Faro-Yi-9B-DPO-IQ1_M.gguf	GGUF	IQ1_M	2.03 GB	Download
Faro-Yi-9B-DPO-IQ1_S.gguf	GGUF	IQ1_S	1.88 GB	Download
Faro-Yi-9B-DPO-IQ2_M.gguf	GGUF	IQ2_M	2.89 GB	Download
Faro-Yi-9B-DPO-IQ2_S.gguf	GGUF	IQ2_S	2.68 GB	Download
Faro-Yi-9B-DPO-IQ2_XS.gguf	GGUF	IQ2_XS	2.52 GB	Download
Faro-Yi-9B-DPO-IQ2_XXS.gguf	GGUF	IQ2_XXS	2.29 GB	Download
Faro-Yi-9B-DPO-IQ3_M.gguf	GGUF	IQ3_M	3.78 GB	Download
Faro-Yi-9B-DPO-IQ3_S.gguf	GGUF	IQ3_S	3.64 GB	Download
Faro-Yi-9B-DPO-IQ3_XS.gguf	GGUF	IQ3_XS	3.46 GB	Download
Faro-Yi-9B-DPO-IQ3_XXS.gguf	GGUF	IQ3_XXS	3.24 GB	Download
Faro-Yi-9B-DPO-IQ4_NL.gguf	GGUF	IQ4_NL	4.70 GB	Download
Faro-Yi-9B-DPO-IQ4_XS.gguf	GGUF	IQ4_XS	4.46 GB	Download
Faro-Yi-9B-DPO-Q2_K.gguf	GGUF	Q2_K	3.12 GB	Download
Faro-Yi-9B-DPO-Q2_K_S.gguf	GGUF	Q2_K_S	2.90 GB	Download
Faro-Yi-9B-DPO-Q3_K_L.gguf	GGUF	Q3_K_L	4.37 GB	Download
Faro-Yi-9B-DPO-Q3_K_M.gguf	GGUF	Q3_K_M	4.03 GB	Download
Faro-Yi-9B-DPO-Q3_K_S.gguf	GGUF	Q3_K_S	3.63 GB	Download
Faro-Yi-9B-DPO-Q4_0.gguf	GGUF	—	4.71 GB	Download
Faro-Yi-9B-DPO-Q4_1.gguf	GGUF	—	5.19 GB	Download
Faro-Yi-9B-DPO-Q4_K_M.gguf	GGUF	Q4_K_M	4.96 GB	Download
Faro-Yi-9B-DPO-Q4_K_S.gguf	GGUF	Q4_K_S	4.72 GB	Download
Faro-Yi-9B-DPO-Q5_0.gguf	GGUF	—	5.70 GB	Download
Faro-Yi-9B-DPO-Q5_1.gguf	GGUF	—	6.19 GB	Download
Faro-Yi-9B-DPO-Q5_K_M.gguf	GGUF	Q5_K_M	5.83 GB	Download
Faro-Yi-9B-DPO-Q5_K_S.gguf	GGUF	Q5_K_S	5.69 GB	Download
Faro-Yi-9B-DPO-Q6_K.gguf	GGUF	Q6_K	6.75 GB	Download
Faro-Yi-9B-DPO-Q8_0.gguf	GGUF	—	8.74 GB	Download

Model Details Live

Model Slug

duyntnet/faro-yi-9b-dpo-imatrix-gguf

Author

duyntnet

Pipeline Task

text-generation

Library

transformers

Created

2025-02-18

Last Modified

2025-02-19

Gated

Private

HF SHA

e53f8a354976729ce45c00f8bce67fa1c1de6f97

License

other

Language

Base Model

Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "license": "other",
    "language": [
      "en"
    ],
    "pipeline_tag": "text-generation",
    "inference": false,
    "tags": [
      "transformers",
      "gguf",
      "imatrix",
      "Faro-Yi-9B-DPO"
    ],
    "frontmatter": {
      "license": "other",
      "language": [
        "en"
      ],
      "pipeline_tag": "text-generation",
      "inference": "false",
      "tags": [
        "transformers",
        "gguf",
        "imatrix",
        "Faro-Yi-9B-DPO"
      ]
    },
    "hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/62cd3a3691d27e60db0698b0/ArlnloL4aPfiiD6kUqaSH.png",
    "summary": "This is the DPO version of wenbopan/Faro-Yi-9B. Compared to Faro-Yi-9B and Yi-9B-200K, the DPO model excels at many tasks, surpassing the original Yi-9B-200K by a large margin. On the Open LLM Leaderboard, it ranks **#2** among all 9B models, **#1** among all Yi-9B variants. | **Metric**              | **MMLU**  | **GSM8K** | **hellaswag** | **truthfulqa** | **ai2_arc** | **winogrande** | **CMMLU** | | ----------------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- | --------- | | **Yi-9B-200K**          | 65.73     | 50.49     | 56.72         | 33.80          | 69.25       | 71.67          | 71.97     | | **Faro-Yi-9B**          | 68.80     | 63.08     | 57.28         | 40.86          | 72.58       | 71.11          | 73.28     | | **Faro-Yi-9B-DPO**      | **69.98** | **66.11** | **59.04**     | **48.01**      | **75.68**   | **73.40**      | **75.23** | Faro-Yi-9B-DPO's responses are also favored by GPT-4 Judge in MT-Bench !image/png",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: other\nlanguage:\n- en\npipeline_tag: text-generation\ninference: false\ntags:\n- transformers\n- gguf\n- imatrix\n- Faro-Yi-9B-DPO\n---\nQuantizations of https://huggingface.co/wenbopan/Faro-Yi-9B-DPO\n\n### Inference Clients/UIs\n* [llama.cpp](https://github.com/ggerganov/llama.cpp)\n* [KoboldCPP](https://github.com/LostRuins/koboldcpp)\n* [ollama](https://github.com/ollama/ollama)\n* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)\n* [jan](https://github.com/janhq/jan)\n* [GPT4All](https://github.com/nomic-ai/gpt4all)\n---\n\n# From original readme\n\nThis is the DPO version of [wenbopan/Faro-Yi-9B](https://huggingface.co/wenbopan/Faro-Yi-9B). Compared to Faro-Yi-9B and [Yi-9B-200K](https://huggingface.co/01-ai/Yi-9B-200K), the DPO model excels at many tasks, surpassing the original Yi-9B-200K by a large margin. On the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), it ranks **#2** among all 9B models, **#1** among all Yi-9B variants.\n\n| **Metric**              | **MMLU**  | **GSM8K** | **hellaswag** | **truthfulqa** | **ai2_arc** | **winogrande** | **CMMLU** |\n| ----------------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- | --------- |\n| **Yi-9B-200K**          | 65.73     | 50.49     | 56.72         | 33.80          | 69.25       | 71.67          | 71.97     |\n| **Faro-Yi-9B**          | 68.80     | 63.08     | 57.28         | 40.86          | 72.58       | 71.11          | 73.28     |\n| **Faro-Yi-9B-DPO**      | **69.98** | **66.11** | **59.04**     | **48.01**      | **75.68**   | **73.40**      | **75.23** |\n\nFaro-Yi-9B-DPO's responses are also favored by GPT-4 Judge in MT-Bench\n\n![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd3a3691d27e60db0698b0/ArlnloL4aPfiiD6kUqaSH.png)\n\n## How to Use\n\nFaro-Yi-9B-DPO uses the chatml template and performs well in both short and long contexts. For longer inputs under **24GB of VRAM**, I recommend to use vLLM to have a max prompt of 32K. Setting `kv_cache_dtype=\"fp8_e5m2\"` allows for 48K input length. 4bit-AWQ quantization on top of that can boost input length to 160K, albeit with some performance impact. Adjust `max_model_len` arg in vLLM or `config.json` to avoid OOM.\n\n\n```python\nimport io\nimport requests\nfrom PyPDF2 import PdfReader\nfrom vllm import LLM, SamplingParams\n\nllm = LLM(model=\"wenbopan/Faro-Yi-9B-DPO\", kv_cache_dtype=\"fp8_e5m2\", max_model_len=100000)\n\npdf_data = io.BytesIO(requests.get(\"https://arxiv.org/pdf/2303.08774.pdf\").content)\ndocument = \"\".join(page.extract_text() for page in PdfReader(pdf_data).pages) # 100 pages\n\nquestion = f\"{document}\\n\\nAccording to the paper, what is the parameter count of GPT-4?\"\nmessages = [ {\"role\": \"user\", \"content\": question} ] # 83K tokens\nprompt = llm.get_tokenizer().apply_chat_template(messages, add_generation_prompt=True, tokenize=False)\noutput = llm.generate(prompt, SamplingParams(temperature=0.8, max_tokens=500))\nprint(output[0].outputs[0].text)\n# Yi-9B-200K:      175B. GPT-4 has 175B \\nparameters. How many models were combined to create GPT-4? Answer: 6. ...\n# Faro-Yi-9B: GPT-4 does not have a publicly disclosed parameter count due to the competitive landscape and safety implications of large-scale models like GPT-4. ...\n```\n\n\n<details> <summary>Or With Transformers</summary>\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel = AutoModelForCausalLM.from_pretrained('wenbopan/Faro-Yi-9B-DPO', device_map=\"cuda\")\ntokenizer = AutoTokenizer.from_pretrained('wenbopan/Faro-Yi-9B-DPO')\nmessages = [\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant. Always answer with a short response.\"},\n    {\"role\": \"user\", \"content\": \"Tell me what is Pythagorean theorem like you are a pirate.\"}\n]\n\ninput_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors=\"pt\").to(model.device)\ngenerated_ids = model.generate(input_ids, max_new_tokens=512, temperature=0.5)\nresponse = tokenizer.decode(generated_ids[0], skip_special_tokens=True) # Aye, matey! The Pythagorean theorem is a nautical rule that helps us find the length of the third side of a triangle. ...\n```\n\n</details>\n",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "imatrix",
    "Faro-Yi-9B-DPO",
    "text-generation",
    "en",
    "arxiv:2303.08774",
    "license:other",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 245,
  "gated": false,
  "private": false,
  "last_modified": "2025-02-19T03:21:07.000Z",
  "created_at": "2025-02-18T23:23:46.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "transformers"
}

Source payload excerpt (from Hugging Face API)

{
  "_id": "67b51682d33342e9bca31bcc",
  "id": "duyntnet/Faro-Yi-9B-DPO-imatrix-GGUF",
  "modelId": "duyntnet/Faro-Yi-9B-DPO-imatrix-GGUF",
  "sha": "e53f8a354976729ce45c00f8bce67fa1c1de6f97",
  "createdAt": "2025-02-18T23:23:46.000Z",
  "lastModified": "2025-02-19T03:21:07.000Z",
  "author": "duyntnet",
  "downloads": 245,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "transformers",
  "siblings_count": 29
}

duyntnet/faro-yi-9b-dpo-imatrix-gguf overview

Repository Files & Downloads

Model Details Live

Metadata Inspector

More models in this shard