GraySoft
Projects Models About FAQ Contact Download guIDE →

duyntnet/faro-yi-9b-dpo-imatrix-gguf IQ3_M GGUF - Free GGUF Download is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. This page helps you pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

duyntnet/faro-yi-9b-dpo-imatrix-gguf overview

This is the DPO version of wenbopan/Faro-Yi-9B. Compared to Faro-Yi-9B and Yi-9B-200K, the DPO model excels at many tasks, surpassing the original Yi-9B-200K by a large margin. On the Open LLM Leaderboard, it ranks #2 among all 9B models, #1 among all Yi-9B variants. | Metric | MMLU | GSM8K | hellaswag | truthfulqa | ai2_arc | winogrande | CMMLU | | ----------------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- | --------- | | Yi-9B-200K | 65.73 | 50.49 | 56.72 | 33.80 | 69.25 | 71.67 | 71.97 | | Faro-Yi-9B | 68.80 | 63.08 | 57.28 | 40.86 | 72.58 | 71.11 | 73.28 | | Faro-Yi-9B-DPO | 69.98 | 66.11 | 59.04 | 48.01 | 75.68 | 73.40 | 75.23 | Faro-Yi-9B-DPO's responses are also favored by GPT-4 Judge in MT-Bench !image/png

transformersggufimatrixFaro-Yi-9B-DPOtext-generationenarxiv:2303.08774license:otherregion:usconversational
duyntnet/faro-yi-9b-dpo-imatrix-gguf visual
Downloads
245
Likes
0
Pipeline
text-generation
Library
transformers
Visibility
Public
Access
Open

Repository Files & Downloads

27 files detected
Direct downloads for all repository files
FileTypeQuantizationSizeLink
Faro-Yi-9B-DPO-IQ1_M.gguf GGUF IQ1_M 2.03 GB Download
Faro-Yi-9B-DPO-IQ1_S.gguf GGUF IQ1_S 1.88 GB Download
Faro-Yi-9B-DPO-IQ2_M.gguf GGUF IQ2_M 2.89 GB Download
Faro-Yi-9B-DPO-IQ2_S.gguf GGUF IQ2_S 2.68 GB Download
Faro-Yi-9B-DPO-IQ2_XS.gguf GGUF IQ2_XS 2.52 GB Download
Faro-Yi-9B-DPO-IQ2_XXS.gguf GGUF IQ2_XXS 2.29 GB Download
Faro-Yi-9B-DPO-IQ3_M.gguf GGUF IQ3_M 3.78 GB Download
Faro-Yi-9B-DPO-IQ3_S.gguf GGUF IQ3_S 3.64 GB Download
Faro-Yi-9B-DPO-IQ3_XS.gguf GGUF IQ3_XS 3.46 GB Download
Faro-Yi-9B-DPO-IQ3_XXS.gguf GGUF IQ3_XXS 3.24 GB Download
Faro-Yi-9B-DPO-IQ4_NL.gguf GGUF IQ4_NL 4.70 GB Download
Faro-Yi-9B-DPO-IQ4_XS.gguf GGUF IQ4_XS 4.46 GB Download
Faro-Yi-9B-DPO-Q2_K.gguf GGUF Q2_K 3.12 GB Download
Faro-Yi-9B-DPO-Q2_K_S.gguf GGUF Q2_K_S 2.90 GB Download
Faro-Yi-9B-DPO-Q3_K_L.gguf GGUF Q3_K_L 4.37 GB Download
Faro-Yi-9B-DPO-Q3_K_M.gguf GGUF Q3_K_M 4.03 GB Download
Faro-Yi-9B-DPO-Q3_K_S.gguf GGUF Q3_K_S 3.63 GB Download
Faro-Yi-9B-DPO-Q4_0.gguf GGUF 4.71 GB Download
Faro-Yi-9B-DPO-Q4_1.gguf GGUF 5.19 GB Download
Faro-Yi-9B-DPO-Q4_K_M.gguf GGUF Q4_K_M 4.96 GB Download
Faro-Yi-9B-DPO-Q4_K_S.gguf GGUF Q4_K_S 4.72 GB Download
Faro-Yi-9B-DPO-Q5_0.gguf GGUF 5.70 GB Download
Faro-Yi-9B-DPO-Q5_1.gguf GGUF 6.19 GB Download
Faro-Yi-9B-DPO-Q5_K_M.gguf GGUF Q5_K_M 5.83 GB Download
Faro-Yi-9B-DPO-Q5_K_S.gguf GGUF Q5_K_S 5.69 GB Download
Faro-Yi-9B-DPO-Q6_K.gguf GGUF Q6_K 6.75 GB Download
Faro-Yi-9B-DPO-Q8_0.gguf GGUF 8.74 GB Download

Model Details Live

Model Slug
duyntnet/faro-yi-9b-dpo-imatrix-gguf
Author
duyntnet
Pipeline Task
text-generation
Library
transformers
Created
2025-02-18
Last Modified
2025-02-19
Gated
No
Private
No
HF SHA
e53f8a354976729ce45c00f8bce67fa1c1de6f97
License
other
Language
en
Base Model
Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "license": "other",
    "language": [
      "en"
    ],
    "pipeline_tag": "text-generation",
    "inference": false,
    "tags": [
      "transformers",
      "gguf",
      "imatrix",
      "Faro-Yi-9B-DPO"
    ],
    "frontmatter": {
      "license": "other",
      "language": [
        "en"
      ],
      "pipeline_tag": "text-generation",
      "inference": "false",
      "tags": [
        "transformers",
        "gguf",
        "imatrix",
        "Faro-Yi-9B-DPO"
      ]
    },
    "hero_image_url": "https://cdn-uploads.huggingface.co/production/uploads/62cd3a3691d27e60db0698b0/ArlnloL4aPfiiD6kUqaSH.png",
    "summary": "This is the DPO version of wenbopan/Faro-Yi-9B. Compared to Faro-Yi-9B and Yi-9B-200K, the DPO model excels at many tasks, surpassing the original Yi-9B-200K by a large margin. On the Open LLM Leaderboard, it ranks **#2** among all 9B models, **#1** among all Yi-9B variants. | **Metric**              | **MMLU**  | **GSM8K** | **hellaswag** | **truthfulqa** | **ai2_arc** | **winogrande** | **CMMLU** | | ----------------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- | --------- | | **Yi-9B-200K**          | 65.73     | 50.49     | 56.72         | 33.80          | 69.25       | 71.67          | 71.97     | | **Faro-Yi-9B**          | 68.80     | 63.08     | 57.28         | 40.86          | 72.58       | 71.11          | 73.28     | | **Faro-Yi-9B-DPO**      | **69.98** | **66.11** | **59.04**     | **48.01**      | **75.68**   | **73.40**      | **75.23** | Faro-Yi-9B-DPO's responses are also favored by GPT-4 Judge in MT-Bench !image/png",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: other\nlanguage:\n- en\npipeline_tag: text-generation\ninference: false\ntags:\n- transformers\n- gguf\n- imatrix\n- Faro-Yi-9B-DPO\n---\nQuantizations of https://huggingface.co/wenbopan/Faro-Yi-9B-DPO\n\n### Inference Clients/UIs\n* [llama.cpp](https://github.com/ggerganov/llama.cpp)\n* [KoboldCPP](https://github.com/LostRuins/koboldcpp)\n* [ollama](https://github.com/ollama/ollama)\n* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)\n* [jan](https://github.com/janhq/jan)\n* [GPT4All](https://github.com/nomic-ai/gpt4all)\n---\n\n# From original readme\n\nThis is the DPO version of [wenbopan/Faro-Yi-9B](https://huggingface.co/wenbopan/Faro-Yi-9B). Compared to Faro-Yi-9B and [Yi-9B-200K](https://huggingface.co/01-ai/Yi-9B-200K), the DPO model excels at many tasks, surpassing the original Yi-9B-200K by a large margin. On the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), it ranks **#2** among all 9B models, **#1** among all Yi-9B variants.\n\n| **Metric**              | **MMLU**  | **GSM8K** | **hellaswag** | **truthfulqa** | **ai2_arc** | **winogrande** | **CMMLU** |\n| ----------------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- | --------- |\n| **Yi-9B-200K**          | 65.73     | 50.49     | 56.72         | 33.80          | 69.25       | 71.67          | 71.97     |\n| **Faro-Yi-9B**          | 68.80     | 63.08     | 57.28         | 40.86          | 72.58       | 71.11          | 73.28     |\n| **Faro-Yi-9B-DPO**      | **69.98** | **66.11** | **59.04**     | **48.01**      | **75.68**   | **73.40**      | **75.23** |\n\nFaro-Yi-9B-DPO's responses are also favored by GPT-4 Judge in MT-Bench\n\n![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd3a3691d27e60db0698b0/ArlnloL4aPfiiD6kUqaSH.png)\n\n## How to Use\n\nFaro-Yi-9B-DPO uses the chatml template and performs well in both short and long contexts. For longer inputs under **24GB of VRAM**, I recommend to use vLLM to have a max prompt of 32K. Setting `kv_cache_dtype=\"fp8_e5m2\"` allows for 48K input length. 4bit-AWQ quantization on top of that can boost input length to 160K, albeit with some performance impact. Adjust `max_model_len` arg in vLLM or `config.json` to avoid OOM.\n\n\n```python\nimport io\nimport requests\nfrom PyPDF2 import PdfReader\nfrom vllm import LLM, SamplingParams\n\nllm = LLM(model=\"wenbopan/Faro-Yi-9B-DPO\", kv_cache_dtype=\"fp8_e5m2\", max_model_len=100000)\n\npdf_data = io.BytesIO(requests.get(\"https://arxiv.org/pdf/2303.08774.pdf\").content)\ndocument = \"\".join(page.extract_text() for page in PdfReader(pdf_data).pages) # 100 pages\n\nquestion = f\"{document}\\n\\nAccording to the paper, what is the parameter count of GPT-4?\"\nmessages = [ {\"role\": \"user\", \"content\": question} ] # 83K tokens\nprompt = llm.get_tokenizer().apply_chat_template(messages, add_generation_prompt=True, tokenize=False)\noutput = llm.generate(prompt, SamplingParams(temperature=0.8, max_tokens=500))\nprint(output[0].outputs[0].text)\n# Yi-9B-200K:      175B. GPT-4 has 175B \\nparameters. How many models were combined to create GPT-4? Answer: 6. ...\n# Faro-Yi-9B: GPT-4 does not have a publicly disclosed parameter count due to the competitive landscape and safety implications of large-scale models like GPT-4. ...\n```\n\n\n<details> <summary>Or With Transformers</summary>\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel = AutoModelForCausalLM.from_pretrained('wenbopan/Faro-Yi-9B-DPO', device_map=\"cuda\")\ntokenizer = AutoTokenizer.from_pretrained('wenbopan/Faro-Yi-9B-DPO')\nmessages = [\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant. Always answer with a short response.\"},\n    {\"role\": \"user\", \"content\": \"Tell me what is Pythagorean theorem like you are a pirate.\"}\n]\n\ninput_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors=\"pt\").to(model.device)\ngenerated_ids = model.generate(input_ids, max_new_tokens=512, temperature=0.5)\nresponse = tokenizer.decode(generated_ids[0], skip_special_tokens=True) # Aye, matey! The Pythagorean theorem is a nautical rule that helps us find the length of the third side of a triangle. ...\n```\n\n</details>\n",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "imatrix",
    "Faro-Yi-9B-DPO",
    "text-generation",
    "en",
    "arxiv:2303.08774",
    "license:other",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 245,
  "gated": false,
  "private": false,
  "last_modified": "2025-02-19T03:21:07.000Z",
  "created_at": "2025-02-18T23:23:46.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "transformers"
}
Source payload excerpt (from Hugging Face API)
{
  "_id": "67b51682d33342e9bca31bcc",
  "id": "duyntnet/Faro-Yi-9B-DPO-imatrix-GGUF",
  "modelId": "duyntnet/Faro-Yi-9B-DPO-imatrix-GGUF",
  "sha": "e53f8a354976729ce45c00f8bce67fa1c1de6f97",
  "createdAt": "2025-02-18T23:23:46.000Z",
  "lastModified": "2025-02-19T03:21:07.000Z",
  "author": "duyntnet",
  "downloads": 245,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "transformers",
  "siblings_count": 29
}