GraySoft
Model Intelligence Sheet

nisten/dolphin-2.8-7b-imatrix-gguf overview

This repository contains iMatrix quantizations of the dolphin-2.8-mistral-7b-v02 model. The original model was trained with 16k long context data on top of a newer mistral-7b, enabling it to work well with up to 32k context. The iMatrix file was generated using the wiki.train.raw dataset, which took a few hours to process. We have also included the wiki.test.raw file for perplexity testing.

Tags: gguf, license:apache-2.0, endpoints_compatible, region:us, conversational
Downloads: 117
Likes: 9
Pipeline: (not specified)
Library: (not specified)
Visibility: Public
Access: Open

Repository Files & Downloads

18 files detected
Direct downloads for all repository files
File               Type   Quantization   Size
dolphin1s.gguf     GGUF   -              2.31 GB
dolphin2k.gguf     GGUF   -              3.22 GB
dolphin2ks.gguf    GGUF   -              2.81 GB
dolphin2m.gguf     GGUF   -              2.84 GB
dolphin2s.gguf     GGUF   -              2.69 GB
dolphin2xxs.gguf   GGUF   -              2.53 GB
dolphin3m.gguf     GGUF   -              3.40 GB
dolphin3s.gguf     GGUF   -              3.44 GB
dolphin3xs.gguf    GGUF   -              3.63 GB
dolphin3xxs.gguf   GGUF   -              3.09 GB
dolphin4km.gguf    GGUF   -              4.28 GB
dolphin4nl.gguf    GGUF   -              4.19 GB
dolphin4xs.gguf    GGUF   -              4.02 GB
dolphin5km.gguf    GGUF   -              4.90 GB
dolphin5ks.gguf    GGUF   -              4.88 GB
dolphin6k.gguf     GGUF   -              5.63 GB
dolphin8bit.gguf   GGUF   -              7.17 GB
dolphinf16.gguf    GGUF   F16            13.49 GB
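As a rough guide to how much each quantization shrinks the model, the listed sizes can be compared against the F16 file. The sketch below is illustrative only: the sizes are a subset copied from the file listing, and the names `SIZES_GB` and `size_ratio` are hypothetical helpers, not part of the repository.

```python
# File sizes in GB, copied from the file listing (a subset, for illustration).
SIZES_GB = {
    "dolphin2xxs.gguf": 2.53,
    "dolphin2s.gguf": 2.69,
    "dolphin4km.gguf": 4.28,
    "dolphin6k.gguf": 5.63,
    "dolphin8bit.gguf": 7.17,
    "dolphinf16.gguf": 13.49,
}


def size_ratio(name, baseline="dolphinf16.gguf", sizes=SIZES_GB):
    """Return the file's size as a fraction of the baseline file's size."""
    return sizes[name] / sizes[baseline]


if __name__ == "__main__":
    for name, size in SIZES_GB.items():
        print(f"{name:18s} {size:6.2f} GB  {size_ratio(name):6.1%} of F16")
```

By this measure the 2xxs file is a little under a fifth of the F16 size, while the 8-bit file is slightly more than half.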

Model Details

Model Slug: nisten/dolphin-2.8-7b-imatrix-gguf
Author: nisten
Pipeline Task: (not specified)
Library: (not specified)
Created: 2024-04-05
Last Modified: 2024-04-06
Gated: No
Private: No
HF SHA: dc05bf9afdd6219a21ac279cf20aad0927afb1ca
License: apache-2.0
Language: Unknown
Base Model: Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "license": "apache-2.0",
    "frontmatter": {
      "license": "apache-2.0"
    },
    "hero_image_url": "",
    "summary": "This repository contains iMatrix quantizations of the dolphin-2.8-mistral-7b-v02 model. The original model was trained with 16k long context data on top of a newer mistral-7b, enabling it to work well with up to 32k context. The iMatrix file was generated using the wiki.train.raw dataset, which took a few hours to process. We have also included the wiki.test.raw file for perplexity testing.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nlicense: apache-2.0\n---\n\n\n# Dolphin-2.8-Mistral-7B-v2 iMatrix Quantizations\n\nThis repository contains iMatrix quantizations of the [dolphin-2.8-mistral-7b-v02](https://huggingface.co/cognitivecomputations/dolphin-2.8-mistral-7b-v02) model. The original model was trained with 16k long context data on top of a newer mistral-7b, enabling it to work well with up to 32k context.\n\nThe iMatrix file was generated using the `wiki.train.raw` dataset, which took a few hours to process. We have also included the `wiki.test.raw` file for perplexity testing.\n\n## Quantization Benefits\n\nYou'll notice that these quantizations are slightly larger compared to others, but they offer much lower perplexity. For example, the 2s 2-bit mixed models are very usable due to this custom quantization and don't lose much perplexity compared to the full f16 model.\n\n## Notes\n\n- The 8-bit weight is **not** iMatrix quantized (although it wouldn't make a significant difference). 
It can be used as a reference perplexity measurement along with `dolphinf16`.\n- All other models, including the 4k variants, have been quantized with iMatrix and should exhibit better perplexity performance compared to regular k quantizations.\n- iMatrix quantization can be applied to all k quantizations, not just the i ones.\n- 1bit quant gives garbage, but all else, including 2xxs are surprisingly very coherent\n\n## Perplexity values\n\n```./perplexity -m dolphin2m.gguf -f wiki.test.raw -ngl 34```\n\n```bash\ndolphinf16.gguf perplexity - [1]4.3052,[2]4.8421,[3]5.7401,[4]6.6554,[5]6.6552,[6]6.6580,[7]6.9198,[8]7.0918,[9]7.2503,[10]7.5712,[11]7.8367,[12]7.8476,\nFinal estimate: PPL = 7.8476 +/- 0.35984    THIS IS BASELINE \n\ndolphin1bit.gguf perplexity - [1]59477.7292,[2]50746.4580,[3]53932.3131,[4]55797.8433,[5]45995.5032,[6]46595.4234,[7]45130.6779,[8]40769.8593,[9]41322.7842,[10]50644.7393,[11]50676.5808,[12]51939.5094,\nFinal estimate: PPL = 51939.5094 +/- 1339.29301     1BIT GIVES GARBAGE OUTPUT\n\ndolphin2xxs.gguf perplexity - [1]5.4651,[2]6.7941,[3]7.8700,[4]8.7155,[5]8.3566,[6]8.3316,[7]8.6121,[8]8.7565,[9]8.9041,[10]9.3572,[11]9.6426,[12]9.5626,\nFinal estimate: PPL = 9.5626 +/- 0.43895    9.5 vs 7.8 at f16, means lossy but coherent\n\ndolphin2s.gguf perplexity - [1]5.0014,[2]5.9477,[3]6.8424,[4]7.6348,[5]7.4755,[6]7.4667,[7]7.7625,[8]7.8807,[9]8.0374,[10]8.4086,[11]8.6475,[12]8.6427,\nFinal estimate: PPL = 8.6427 +/- 0.39501\n\ndolphin2m.gguf perplexity - [1]4.5874,[2]5.3203,[3]6.2334,[4]7.1444,[5]7.1188,[6]7.1422,[7]7.4717,[8]7.6180,[9]7.7948,[10]8.1319,[11]8.3747,[12]8.4095,\nFinal estimate: PPL = 8.4095 +/- 0.38329\n\ndolphin2k.gguf perplexity - [1]4.6331,[2]5.2648,[3]6.0493,[4]7.0165,[5]6.9300,[6]6.9177,[7]7.2362,[8]7.4417,[9]7.6292,[10]7.9640,[11]8.2121,[12]8.1930,\nFinal estimate: PPL = 8.1930 +/- 0.37241\n\ndolphin2ks.gguf perplexity - 
[1]4.7995,[2]5.6653,[3]6.4331,[4]7.3841,[5]7.2724,[6]7.3161,[7]7.6567,[8]7.8423,[9]8.0129,[10]8.4033,[11]8.6636,[12]8.6391,\nFinal estimate: PPL = 8.6391 +/- 0.39315\n\ndolphin3s.gguf perplexity - [1]4.3574,[2]4.9936,[3]5.8814,[4]6.8093,[5]6.8086,[6]6.7949,[7]7.0638,[8]7.2204,[9]7.3844,[10]7.6895,[11]7.9489,[12]7.9527,\nFinal estimate: PPL = 7.9527 +/- 0.36202\n\ndolphin3xs.gguf perplexity - [1]4.3161,[2]4.9579,[3]5.8647,[4]6.8064,[5]6.7614,[6]6.7501,[7]7.0133,[8]7.2103,[9]7.3862,[10]7.7265,[11]7.9813,[12]7.9780,\nFinal estimate: PPL = 7.9780 +/- 0.36655\n\ndolphin3xxs.gguf perplexity - [1]4.5418,[2]5.0902,[3]6.0117,[4]6.9852,[5]6.9329,[6]6.9165,[7]7.1853,[8]7.3359,[9]7.4923,[10]7.8122,[11]8.0696,[12]8.0592,\nFinal estimate: PPL = 8.0592 +/- 0.36502\n\ndolphin3m.gguf perplexity  - [1]4.3203,[2]4.9566,[3]5.8151,[4]6.7619,[5]6.7801,[6]6.7762,[7]7.0351,[8]7.2054,[9]7.3766,[10]7.6896,[11]7.9580,[12]7.9660,\nFinal estimate: PPL = 7.9660 +/- 0.36234\n\ndolphin4km.gguf perplexity - [1]4.3331,[2]4.9129,[3]5.7915,[4]6.7030,[5]6.6921,[6]6.6978,[7]6.9570,[8]7.1284,[9]7.2854,[10]7.6098,[11]7.8696,[12]7.8767,\nFinal estimate: PPL = 7.8767 +/- 0.35875\n\ndolphin4nl.gguf perplexity - [1]4.2682,[2]4.8494,[3]5.7530,[4]6.6890,[5]6.6672,[6]6.6637,[7]6.9332,[8]7.1126,[9]7.2821,[10]7.5998,[11]7.8733,[12]7.8875,\nFinal estimate: PPL = 7.8875 +/- 0.36227\n\ndolphin4xs.gguf perplexity - [1]4.2986,[2]4.8610,[3]5.7658,[4]6.6906,[5]6.6621,[6]6.6608,[7]6.9321,[8]7.1140,[9]7.2892,[10]7.6085,[11]7.8806,[12]7.8921,\nFinal estimate: PPL = 7.8921 +/- 0.36258\n\ndolphin5ks.gguf perplexity - [1]4.2557,[2]4.8249,[3]5.7413,[4]6.6671,[5]6.6611,[6]6.6686,[7]6.9389,[8]7.1079,[9]7.2707,[10]7.5962,[11]7.8529,[12]7.8627,\nFinal estimate: PPL = 7.8627 +/- 0.36124\n\ndolphin5km.gguf perplexity - [1]4.3191,[2]4.8597,[3]5.7844,[4]6.7120,[5]6.6994,[6]6.6964,[7]6.9569,[8]7.1215,[9]7.2792,[10]7.6109,[11]7.8682,[12]7.8794,\nFinal estimate: PPL = 7.8794 +/- 0.36185\n\ndolphin6k.gguf perplexity - 
[1]4.3264,[2]4.8531,[3]5.7574,[4]6.6741,[5]6.6707,[6]6.6795,[7]6.9362,[8]7.1076,[9]7.2678,[10]7.5864,[11]7.8496,[12]7.8628,\nFinal estimate: PPL = 7.8628 +/- 0.36075\n\ndolphin8bit.gguf perplexity - [1]4.3063,[2]4.8463,[3]5.7347,[4]6.6499,[5]6.6471,[6]6.6531,[7]6.9160,[8]7.0899,[9]7.2509,[10]7.5705,[11]7.8357,[12]7.8466,\nFinal estimate: PPL = 7.8466 +/- 0.35948\n```\n\n\nAs we can see 2bit xxs with this method actually is surprisingly coherent.",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 9,
  "downloads": 117,
  "gated": false,
  "private": false,
  "last_modified": "2024-04-06T00:38:14.000Z",
  "created_at": "2024-04-05T21:47:46.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
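Each perplexity run quoted in the README text above ends with a line of the form `Final estimate: PPL = 7.8476 +/- 0.35984`. A minimal sketch for extracting those numbers and expressing them relative to the f16 baseline might look like the following. The function names and the `SAMPLE` dictionary (three entries copied from the README) are illustrative assumptions, not something the repository ships.

```python
import re

# Matches e.g. "Final estimate: PPL = 7.8476 +/- 0.35984" (trailing notes are ignored).
FINAL_RE = re.compile(r"Final estimate: PPL = ([0-9.]+) \+/- ([0-9.]+)")


def parse_final_estimate(line):
    """Return (ppl, uncertainty) from a 'Final estimate' line, or None if absent."""
    match = FINAL_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def relative_increase(ppl, baseline):
    """Fractional perplexity increase of a quantized model over the baseline."""
    return (ppl - baseline) / baseline


# Three 'Final estimate' lines copied from the README, keyed by file name.
SAMPLE = {
    "dolphinf16.gguf": "Final estimate: PPL = 7.8476 +/- 0.35984    THIS IS BASELINE",
    "dolphin2xxs.gguf": "Final estimate: PPL = 9.5626 +/- 0.43895",
    "dolphin4km.gguf": "Final estimate: PPL = 7.8767 +/- 0.35875",
}

if __name__ == "__main__":
    baseline, _ = parse_final_estimate(SAMPLE["dolphinf16.gguf"])
    for name, line in SAMPLE.items():
        ppl, err = parse_final_estimate(line)
        print(f"{name:18s} PPL {ppl:.4f} +/- {err:.5f}  "
              f"({relative_increase(ppl, baseline):+.1%} vs f16)")
```

Read this way, the 2xxs figure sits roughly a fifth above the f16 baseline, while the gap between 4km and f16 is far smaller than the reported +/- uncertainty of either run.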
Source payload excerpt (from Hugging Face API)
{
  "_id": "6610718294e0b3bff36eb367",
  "id": "nisten/dolphin-2.8-7b-imatrix-gguf",
  "modelId": "nisten/dolphin-2.8-7b-imatrix-gguf",
  "sha": "dc05bf9afdd6219a21ac279cf20aad0927afb1ca",
  "createdAt": "2024-04-05T21:47:46.000Z",
  "lastModified": "2024-04-06T00:38:14.000Z",
  "author": "nisten",
  "downloads": 117,
  "likes": 9,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}