GraySoft
Model Intelligence Sheet

richarderkhov/mediatek-research_-_breeze-7b-32k-base-v1_0-gguf overview

MediaTek Research Breeze-7B (hereinafter Breeze-7B) is a language model family built on Mistral-7B and intended specifically for Traditional Chinese use.

Breeze-7B-Base is the base model of the Breeze-7B series. It is suitable if you have substantial fine-tuning data to tune it for your specific use case. Breeze-7B-Instruct derives from Breeze-7B-Base and can be used as-is for commonly seen tasks.

Breeze-7B-32k-Base extends the base model with more data, a base change, and the disabling of the sliding window. Its 32k-token context length is roughly equivalent to 44k Traditional Chinese characters. Breeze-7B-32k-Instruct derives from Breeze-7B-32k-Base and can be used as-is for commonly seen tasks.

A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.
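The statement that the 32k-token context corresponds to roughly 44k Traditional Chinese characters implies about 1.375 characters per token. The sketch below turns that into a quick budget check. It assumes "32k" and "44k" mean 32,000 and 44,000 and that the ratio is constant, which the card only gives "roughly speaking". The function names are hypothetical and not part of any Breeze tooling.

```python
# Approximate figures taken from the model card: 32k tokens ~ 44k characters.
# Treating them as exact constants is an assumption made for illustration.
CONTEXT_TOKENS = 32_000
CONTEXT_CHARS = 44_000


def estimate_tokens(num_chars: int) -> int:
    """Estimate tokens needed for num_chars Traditional Chinese characters.

    Uses integer ceiling division so the estimate never rounds down.
    """
    return (num_chars * CONTEXT_TOKENS + CONTEXT_CHARS - 1) // CONTEXT_CHARS


def fits_in_context(num_chars: int, reserve_tokens: int = 0) -> bool:
    """Check whether a document plus a reserved output budget fits in 32k tokens."""
    return estimate_tokens(num_chars) + reserve_tokens <= CONTEXT_TOKENS


if __name__ == "__main__":
    for chars in (11_000, 44_000, 50_000):
        print(chars, estimate_tokens(chars), fits_in_context(chars, reserve_tokens=512))
```

For real inputs, counting tokens with the model's own tokenizer is more reliable than this ratio, especially for mixed Chinese and English text.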

Tags: gguf, arxiv:2403.02712, endpoints_compatible, region:us
Downloads: 521
Likes: 0
Pipeline: (not set)
Library: (not set)
Visibility: Public
Access: Open

Repository Files & Downloads

22 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size |
| ---- | ---- | ---- | ---- |
| Breeze-7B-32k-Base-v1_0.IQ3_M.gguf | GGUF | IQ3_M | 3.21 GB |
| Breeze-7B-32k-Base-v1_0.IQ3_S.gguf | GGUF | IQ3_S | 3.11 GB |
| Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf | GGUF | IQ3_XS | 2.96 GB |
| Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf | GGUF | IQ4_NL | 4.03 GB |
| Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf | GGUF | IQ4_XS | 3.83 GB |
| Breeze-7B-32k-Base-v1_0.Q2_K.gguf | GGUF | Q2_K | 2.67 GB |
| Breeze-7B-32k-Base-v1_0.Q3_K.gguf | GGUF | Q3_K | 3.42 GB |
| Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf | GGUF | Q3_K_L | 3.71 GB |
| Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf | GGUF | Q3_K_M | 3.42 GB |
| Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf | GGUF | Q3_K_S | 3.09 GB |
| Breeze-7B-32k-Base-v1_0.Q4_0.gguf | GGUF | Q4_0 | 3.99 GB |
| Breeze-7B-32k-Base-v1_0.Q4_1.gguf | GGUF | Q4_1 | 4.41 GB |
| Breeze-7B-32k-Base-v1_0.Q4_K.gguf | GGUF | Q4_K | 4.23 GB |
| Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf | GGUF | Q4_K_M | 4.23 GB |
| Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf | GGUF | Q4_K_S | 4.02 GB |
| Breeze-7B-32k-Base-v1_0.Q5_0.gguf | GGUF | Q5_0 | 4.83 GB |
| Breeze-7B-32k-Base-v1_0.Q5_1.gguf | GGUF | Q5_1 | 5.25 GB |
| Breeze-7B-32k-Base-v1_0.Q5_K.gguf | GGUF | Q5_K | 4.95 GB |
| Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf | GGUF | Q5_K_M | 4.95 GB |
| Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf | GGUF | Q5_K_S | 4.83 GB |
| Breeze-7B-32k-Base-v1_0.Q6_K.gguf | GGUF | Q6_K | 5.73 GB |
| Breeze-7B-32k-Base-v1_0.Q8_0.gguf | GGUF | Q8_0 | 7.41 GB |
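Choosing among the 22 files usually comes down to the largest quantization that fits the available disk or memory budget. The following is a minimal sketch of that selection using the sizes listed in the table. The helper names `pick_quant` and `download_url` are hypothetical, and the URL uses the Hugging Face `resolve/main` path, which is assumed here to serve the raw file for this repository.

```python
from typing import Optional

REPO_ID = "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf"
MODEL_PREFIX = "Breeze-7B-32k-Base-v1_0"

# Quantization name -> file size in GB, copied from the file table.
QUANT_SIZES_GB = {
    "Q2_K": 2.67, "IQ3_XS": 2.96, "Q3_K_S": 3.09, "IQ3_S": 3.11,
    "IQ3_M": 3.21, "Q3_K": 3.42, "Q3_K_M": 3.42, "Q3_K_L": 3.71,
    "IQ4_XS": 3.83, "Q4_0": 3.99, "Q4_K_S": 4.02, "IQ4_NL": 4.03,
    "Q4_K": 4.23, "Q4_K_M": 4.23, "Q4_1": 4.41, "Q5_0": 4.83,
    "Q5_K_S": 4.83, "Q5_K": 4.95, "Q5_K_M": 4.95, "Q5_1": 5.25,
    "Q6_K": 5.73, "Q8_0": 7.41,
}


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the largest quantization whose file fits in budget_gb, or None.

    Ties on size are broken by name so the result is deterministic.
    """
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=lambda q: (fitting[q], q))


def download_url(quant: str) -> str:
    """Build the direct-download URL for one quantization (assumed URL scheme)."""
    if quant not in QUANT_SIZES_GB:
        raise KeyError(f"unknown quantization: {quant}")
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{MODEL_PREFIX}.{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(4.5)
    print(choice, download_url(choice) if choice else "nothing fits")
```

File size is only a lower bound on memory use: a runtime also needs room for the KV cache, which grows with context length, so a 32k-token session needs noticeably more headroom than the file size alone suggests.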

Model Details

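Model Slug: richarderkhov/mediatek-research_-_breeze-7b-32k-base-v1_0-gguf
Author: RichardErkhov
Pipeline Task: (not set; the original model card declares text-generation)
Library: (not set)
Created: 2024-09-18
Last Modified: 2024-09-18
Gated: No
Private: No
HF SHA: 0ab9dc3082b94df989969ea8246bc300380c47ae
License: Unknown (the original model card declares apache-2.0)
Language: Unknown (the original model card lists zh and en)
Base Model: Unknown (the README names MediaTek-Research/Breeze-7B-32k-Base-v1_0 as the original model)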

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/resolve/main/needle-in-a-haystack-performance.png",
    "summary": "MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use. Breeze-7B-Base is the base model for the Breeze-7B series. It is suitable for use if you have substantial fine-tuning data to tune it for your specific use case. Breeze-7B-Instruct derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks. Breeze-7B-32k-Base is extended from the base model with more data, base change, and the disabling of the sliding window. Roughly speaking, that is equivalent to 44k Traditional Chinese characters. Breeze-7B-32k-Instruct derives from the base model Breeze-7B-32k-Base, making the resulting model amenable to be used as-is for commonly seen tasks. Practicality-wise: *A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nBreeze-7B-32k-Base-v1_0 - GGUF\n- Model creator: https://huggingface.co/MediaTek-Research/\n- Original model: https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [Breeze-7B-32k-Base-v1_0.Q2_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q2_K.gguf) | Q2_K | 2.67GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf) | IQ3_XS | 2.96GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_S.gguf) | IQ3_S | 3.11GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf) | Q3_K_S | 3.09GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_M.gguf) | IQ3_M | 3.21GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K.gguf) | Q3_K | 3.42GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf) | Q3_K_M | 3.42GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf) | Q3_K_L | 3.71GB 
|\n| [Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf) | IQ4_XS | 3.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_0.gguf) | Q4_0 | 3.99GB |\n| [Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf) | IQ4_NL | 4.03GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf) | Q4_K_S | 4.02GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K.gguf) | Q4_K | 4.23GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf) | Q4_K_M | 4.23GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_1.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_1.gguf) | Q4_1 | 4.41GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_0.gguf) | Q5_0 | 4.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf) | Q5_K_S | 4.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K.gguf) | Q5_K | 4.95GB |\n| 
[Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf) | Q5_K_M | 4.95GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_1.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_1.gguf) | Q5_1 | 5.25GB |\n| [Breeze-7B-32k-Base-v1_0.Q6_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q6_K.gguf) | Q6_K | 5.73GB |\n| [Breeze-7B-32k-Base-v1_0.Q8_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q8_0.gguf) | Q8_0 | 7.41GB |\n\n\n\n\nOriginal model description:\n---\npipeline_tag: text-generation\nlicense: apache-2.0\nlanguage:\n- zh\n- en\n---\n\n# Model Card for MediaTek Research Breeze-7B-32k-Base-v1_0\n\nMediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1), specifically intended for Traditional Chinese use.\n\n[Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v1_0) is the base model for the Breeze-7B series. \nIt is suitable for use if you have substantial fine-tuning data to tune it for your specific use case.\n\n[Breeze-7B-Instruct](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0) derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks.\n\n[Breeze-7B-32k-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0) is extended from the base model with more data, base change, and the disabling of the sliding window. 
\nRoughly speaking, that is equivalent to 44k Traditional Chinese characters.\n\n[Breeze-7B-32k-Instruct](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Instruct-v1_0) derives from the base model Breeze-7B-32k-Base, making the resulting model amenable to be used as-is for commonly seen tasks.\n\n\n\nPracticality-wise:\n- Breeze-7B-Base expands the original vocabulary with additional 30,000 Traditional Chinese tokens. With the expanded vocabulary, everything else being equal, Breeze-7B operates at twice the inference speed for Traditional Chinese to Mistral-7B and Llama 7B. [See [Inference Performance](#inference-performance).]\n- Breeze-7B-Instruct can be used as is for common tasks such as Q&A, RAG, multi-round chat, and summarization.\n- Breeze-7B-32k-Instruct can perform tasks at a document level (For Chinese, 20 ~ 40 pages).\n\n*A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*\n\n## Features\n\n- Breeze-7B-32k-Base-v1_0\n  - Expanding the vocabulary dictionary size from 32k to 62k to better support Traditional Chinese\n  - 32k-token context length\n \n- Breeze-7B-32k-Instruct-v1_0\n  - Expanding the vocabulary dictionary size from 32k to 62k to better support Traditional Chinese\n  - 32k-token context length\n  - Multi-turn dialogue (without special handling for harmfulness)\n\n## Model Details\n\n- Breeze-7B-32k-Base-v1_0\n  - Pretrained from: [Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v1_0)\n  - Model type: Causal decoder-only transformer language model\n  - Language: English and Traditional Chinese (zh-tw)\n- Breeze-7B-32k-Instruct-v1_0\n  - Finetuned from: [Breeze-7B-32k-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0)\n  - Model type: Causal decoder-only transformer language model\n  - Language: English and Traditional Chinese (zh-tw)\n\n## Long-context Performance\n\n#### 
Needle-in-a-haystack Performance\n\nWe use the passkey retrieval task to test the model's ability to attend to different various depths in a given sequence.\nA key in placed within a long context distracting document for the model to retrieve.\nThe key position is binned into 16 bins, and there are 20 testcases for each bin.\nBreeze-7B-32k-Base clears the tasks with 90+% accuracy, shown in the figure below.\n![Needle-in-a-haystack Performance](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/resolve/main/needle-in-a-haystack-performance.png)\n\n#### Long-DRCD Performance\n\n| **Model/Performance(EM)** | **DRCD** | **DRCD-16k** | **DRCD-32k** |\n|---------------------------|----------|--------------|--------------|\n| **Breeze-7B-32k-Instruct-v1\\_0** |  76.9   |   54.82   |  44.26     |       \n| **Breeze-7B-32k-Base-v1\\_0** | 79.73    | 69.68        | 61.55        |\n| **Breeze-7B-Base-v1\\_0**      | 80.61    | 21.79        | 15.29        | \n\n#### Short-Benchmark Performance\n\n| **Model/Performance(EM)** | **TMMLU+** | **MMLU** | **TABLE** | **MT-Bench-tw** | **MT-Bench** |\n|---------------------------|----------|--------------|--------------|-----|-----|\n| **Breeze-7B-32k-Instruct-v1\\_0** | 41.37    |   61.34   |   34    | 5.8 | 7.4 |\n| **Breeze-7B-Instruct-v1\\_0**      |  42.67  |   62.73    |   39.58 | 6.0 | 7.4 |\n\n## Use in Transformers\n\nFirst, install direct dependencies:\n```\npip install transformers torch accelerate\n```\n<p style=\"color:red;\">Flash-attention2 is strongly recommended for long context scenarios.</p>\n\n```bash\npip install packaging ninja\npip install flash-attn\n```\nThen load the model in transformers:\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\nimport torch\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    \"MediaTek-Research/Breeze-7B-32k-Base-v1_0\",\n    device_map=\"auto\",\n    torch_dtype=torch.bfloat16,\n    attn_implementation=\"flash_attention_2\" # optional but 
highly recommended\n)\nfrom transformers import AutoTokenizer\ntokenizer = AutoTokenizer.from_pretrained(\"MediaTek-Research/Breeze-7B-32k-Base-v1_0\")\ntokenizer.tokenize(\"你好,我可以幫助您解決各種問題、提供資訊和協助您完成許多不同的任務。例如:回答技術問題、提供建議、翻譯文字、尋找資料或協助您安排行程等。請告訴我如何能幫助您。\")\n# Tokenized results\n# ['▁', '你好', ',', '我', '可以', '幫助', '您', '解決', '各種', '問題', '、', '提供', '資訊', '和', '協助', '您', '完成', '許多', '不同', '的', '任務', '。', '例如', ':', '回答', '技術', '問題', '、', '提供', '建議', '、', '翻譯', '文字', '、', '尋找', '資料', '或', '協助', '您', '安排', '行程', '等', '。', '請', '告訴', '我', '如何', '能', '幫助', '您', '。']\n```\n\n\n## Citation\n\n```\n@article{MediaTek-Research2024breeze7b,\n      title={Breeze-7B Technical Report}, \n      author={Chan-Jan Hsu and Chang-Le Liu and Feng-Ting Liao and Po-Chun Hsu and Yi-Chang Chen and Da-Shan Shiu},\n      year={2024},\n      eprint={2403.02712},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2403.02712",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 0,
  "downloads": 521,
  "gated": false,
  "private": false,
  "last_modified": "2024-09-18T16:45:46.000Z",
  "created_at": "2024-09-18T11:15:37.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
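The `tags` array mixes bare flags (`gguf`) with `key:value` entries (`arxiv:2403.02712`). As an illustration of how a consumer of this metadata might separate the two, here is a small sketch. The function name `split_tags` is hypothetical, and the link is built with the standard arXiv abstract URL form.

```python
from typing import Dict, List, Tuple

# Tags copied from the normalized metadata.
TAGS = ["gguf", "arxiv:2403.02712", "endpoints_compatible", "region:us"]


def split_tags(tags: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Split tags into bare flags and key:value pairs (split on the first colon)."""
    flags: List[str] = []
    pairs: Dict[str, str] = {}
    for tag in tags:
        key, sep, value = tag.partition(":")
        if sep:
            pairs[key] = value
        else:
            flags.append(tag)
    return flags, pairs


flags, pairs = split_tags(TAGS)
# The arxiv tag carries the paper identifier cited in the README.
arxiv_url = f"https://arxiv.org/abs/{pairs['arxiv']}"

if __name__ == "__main__":
    print(flags, pairs, arxiv_url)
```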
Source payload excerpt (from Hugging Face API)
{
  "_id": "66eab659bcfa5271dcd4fc54",
  "id": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "modelId": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "sha": "0ab9dc3082b94df989969ea8246bc300380c47ae",
  "createdAt": "2024-09-18T11:15:37.000Z",
  "lastModified": "2024-09-18T16:45:46.000Z",
  "author": "RichardErkhov",
  "downloads": 521,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
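The payload excerpt can be read with the standard library alone. The sketch below parses the excerpt as shown, converts the two timestamps, and derives the repository owner and the time between creation and last modification. The helper name `parse_hf_timestamp` is hypothetical; replacing the trailing `Z` is a workaround for Python versions whose `datetime.fromisoformat` does not accept it.

```python
import json
from datetime import datetime, timedelta

# Payload excerpt copied from the sheet.
PAYLOAD_TEXT = """
{
  "_id": "66eab659bcfa5271dcd4fc54",
  "id": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "modelId": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "sha": "0ab9dc3082b94df989969ea8246bc300380c47ae",
  "createdAt": "2024-09-18T11:15:37.000Z",
  "lastModified": "2024-09-18T16:45:46.000Z",
  "author": "RichardErkhov",
  "downloads": 521,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
"""


def parse_hf_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2024-09-18T11:15:37.000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


payload = json.loads(PAYLOAD_TEXT)
created = parse_hf_timestamp(payload["createdAt"])
modified = parse_hf_timestamp(payload["lastModified"])
upload_window: timedelta = modified - created
owner, repo_name = payload["id"].split("/", 1)
# Empty strings in the payload mean the field was never set on this repository.
unset_fields = [k for k in ("pipeline_tag", "library_name") if payload[k] == ""]

if __name__ == "__main__":
    print(owner, repo_name, upload_window, unset_fields)
```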