Model Intelligence Sheet

richarderkhov/mediatek-research_-_breeze-7b-32k-base-v1_0-gguf overview

MediaTek Research Breeze-7B (hereinafter Breeze-7B) is a language model family built on Mistral-7B and intended specifically for Traditional Chinese use. Breeze-7B-Base is the base model of the series; it is suitable if you have substantial fine-tuning data for your specific use case. Breeze-7B-Instruct derives from Breeze-7B-Base and can be used as-is for common tasks. Breeze-7B-32k-Base extends the base model with more data, a base change, and a disabled sliding window to reach a 32k-token context length, roughly equivalent to 44k Traditional Chinese characters. Breeze-7B-32k-Instruct derives from Breeze-7B-32k-Base and can be used as-is for common tasks. A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.
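The summary's rule of thumb (about 44k Traditional Chinese characters per context window) can be turned into a rough fit check. The sketch below is illustrative only: it assumes the "32k" in the model name means a round 32,000 tokens, and the helper names `estimated_tokens` and `fits_in_context` are hypothetical. Real token counts depend on the tokenizer and on how much non-Chinese text the input contains.

```python
import math

# Figures taken from the summary; the exact token count is an assumption.
CONTEXT_TOKENS = 32_000   # "32k" read as a round figure
CONTEXT_CHARS = 44_000    # "44k Traditional Chinese characters"

# Implied average density: 44,000 / 32,000 = 1.375 characters per token.
CHARS_PER_TOKEN = CONTEXT_CHARS / CONTEXT_TOKENS


def estimated_tokens(num_chars: int) -> int:
    """Estimate the token count of a Traditional Chinese text of num_chars characters."""
    return math.ceil(num_chars / CHARS_PER_TOKEN)


def fits_in_context(num_chars: int, reserved_tokens: int = 0) -> bool:
    """Check whether the text, plus tokens reserved for generation, fits the window."""
    return estimated_tokens(num_chars) + reserved_tokens <= CONTEXT_TOKENS


if __name__ == "__main__":
    for chars in (11_000, 44_000, 50_000):
        print(chars, estimated_tokens(chars), fits_in_context(chars, reserved_tokens=512))
```

For an exact count, tokenize the text with the model's own tokenizer instead of relying on this average.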

gguf, arxiv:2403.02712, endpoints_compatible, region:us

richarderkhov/mediatek-research_-_breeze-7b-32k-base-v1_0-gguf visual

Downloads

521

Likes

0

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

22 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Breeze-7B-32k-Base-v1_0.IQ3_M.gguf	GGUF	IQ3_M	3.21 GB	Download
Breeze-7B-32k-Base-v1_0.IQ3_S.gguf	GGUF	IQ3_S	3.11 GB	Download
Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf	GGUF	IQ3_XS	2.96 GB	Download
Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf	GGUF	IQ4_NL	4.03 GB	Download
Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf	GGUF	IQ4_XS	3.83 GB	Download
Breeze-7B-32k-Base-v1_0.Q2_K.gguf	GGUF	Q2_K	2.67 GB	Download
Breeze-7B-32k-Base-v1_0.Q3_K.gguf	GGUF	Q3_K	3.42 GB	Download
Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf	GGUF	Q3_K_L	3.71 GB	Download
Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf	GGUF	Q3_K_M	3.42 GB	Download
Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf	GGUF	Q3_K_S	3.09 GB	Download
Breeze-7B-32k-Base-v1_0.Q4_0.gguf	GGUF	Q4_0	3.99 GB	Download
Breeze-7B-32k-Base-v1_0.Q4_1.gguf	GGUF	Q4_1	4.41 GB	Download
Breeze-7B-32k-Base-v1_0.Q4_K.gguf	GGUF	Q4_K	4.23 GB	Download
Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf	GGUF	Q4_K_M	4.23 GB	Download
Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf	GGUF	Q4_K_S	4.02 GB	Download
Breeze-7B-32k-Base-v1_0.Q5_0.gguf	GGUF	Q5_0	4.83 GB	Download
Breeze-7B-32k-Base-v1_0.Q5_1.gguf	GGUF	Q5_1	5.25 GB	Download
Breeze-7B-32k-Base-v1_0.Q5_K.gguf	GGUF	Q5_K	4.95 GB	Download
Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf	GGUF	Q5_K_M	4.95 GB	Download
Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf	GGUF	Q5_K_S	4.83 GB	Download
Breeze-7B-32k-Base-v1_0.Q6_K.gguf	GGUF	Q6_K	5.73 GB	Download
Breeze-7B-32k-Base-v1_0.Q8_0.gguf	GGUF	Q8_0	7.41 GB	Download
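When choosing among these files, a common approach is to take the largest quantization that fits a given size budget. The following is a minimal sketch under stated assumptions: the sizes are copied from the table above, the helper names `pick_quant` and `download_url` are hypothetical, and the `resolve/main` URL pattern is assumed to be what the Download links point to (the sheet does not show the link targets). File size is only a lower bound on memory use, since the context cache needs additional room at run time.

```python
from typing import Optional

REPO_ID = "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf"

# Quantization -> file size in GB, as listed in the file table.
QUANT_SIZES_GB = {
    "Q2_K": 2.67, "IQ3_XS": 2.96, "Q3_K_S": 3.09, "IQ3_S": 3.11,
    "IQ3_M": 3.21, "Q3_K": 3.42, "Q3_K_M": 3.42, "Q3_K_L": 3.71,
    "IQ4_XS": 3.83, "Q4_0": 3.99, "Q4_K_S": 4.02, "IQ4_NL": 4.03,
    "Q4_K": 4.23, "Q4_K_M": 4.23, "Q4_1": 4.41, "Q5_0": 4.83,
    "Q5_K_S": 4.83, "Q5_K": 4.95, "Q5_K_M": 4.95, "Q5_1": 5.25,
    "Q6_K": 5.73, "Q8_0": 7.41,
}


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the largest quantization whose file size is within budget_gb."""
    fitting = {q: size for q, size in QUANT_SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        return None
    # On equal sizes, max() keeps the first entry in table order.
    return max(fitting, key=fitting.get)


def download_url(quant: str) -> str:
    """Build a direct-download URL (assumed resolve/main pattern) for a quantization."""
    if quant not in QUANT_SIZES_GB:
        raise KeyError(f"unknown quantization: {quant}")
    filename = f"Breeze-7B-32k-Base-v1_0.{quant}.gguf"
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{filename}"


if __name__ == "__main__":
    choice = pick_quant(4.5)
    print(choice, download_url(choice))
```

Size alone says nothing about output quality; this selection rule only narrows the list to files that fit.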

Model Details

Model Slug

richarderkhov/mediatek-research_-_breeze-7b-32k-base-v1_0-gguf

Author

RichardErkhov

Pipeline Task

—

Library

—

Created

2024-09-18

Last Modified

2024-09-18

Gated

No

Private

No

HF SHA

0ab9dc3082b94df989969ea8246bc300380c47ae

License

Unknown (not declared in this repository; the original model card declares apache-2.0)

Language

Unknown (not declared in this repository; the original model card lists zh and en)

Base Model

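Unknown (not declared in this repository; the README names MediaTek-Research/Breeze-7B-32k-Base-v1_0 as the original model)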

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/resolve/main/needle-in-a-haystack-performance.png",
    "summary": "MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use. Breeze-7B-Base is the base model for the Breeze-7B series. It is suitable for use if you have substantial fine-tuning data to tune it for your specific use case. Breeze-7B-Instruct derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks. Breeze-7B-32k-Base is extended from the base model with more data, base change, and the disabling of the sliding window. Roughly speaking, that is equivalent to 44k Traditional Chinese characters. Breeze-7B-32k-Instruct derives from the base model Breeze-7B-32k-Base, making the resulting model amenable to be used as-is for commonly seen tasks. Practicality-wise: *A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nBreeze-7B-32k-Base-v1_0 - GGUF\n- Model creator: https://huggingface.co/MediaTek-Research/\n- Original model: https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [Breeze-7B-32k-Base-v1_0.Q2_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q2_K.gguf) | Q2_K | 2.67GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf) | IQ3_XS | 2.96GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_S.gguf) | IQ3_S | 3.11GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf) | Q3_K_S | 3.09GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_M.gguf) | IQ3_M | 3.21GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K.gguf) | Q3_K | 3.42GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf) | Q3_K_M | 3.42GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf) | Q3_K_L | 3.71GB 
|\n| [Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf) | IQ4_XS | 3.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_0.gguf) | Q4_0 | 3.99GB |\n| [Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf) | IQ4_NL | 4.03GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf) | Q4_K_S | 4.02GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K.gguf) | Q4_K | 4.23GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf) | Q4_K_M | 4.23GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_1.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_1.gguf) | Q4_1 | 4.41GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_0.gguf) | Q5_0 | 4.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf) | Q5_K_S | 4.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K.gguf) | Q5_K | 4.95GB |\n| 
[Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf) | Q5_K_M | 4.95GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_1.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_1.gguf) | Q5_1 | 5.25GB |\n| [Breeze-7B-32k-Base-v1_0.Q6_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q6_K.gguf) | Q6_K | 5.73GB |\n| [Breeze-7B-32k-Base-v1_0.Q8_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q8_0.gguf) | Q8_0 | 7.41GB |\n\n\n\n\nOriginal model description:\n---\npipeline_tag: text-generation\nlicense: apache-2.0\nlanguage:\n- zh\n- en\n---\n\n# Model Card for MediaTek Research Breeze-7B-32k-Base-v1_0\n\nMediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1), specifically intended for Traditional Chinese use.\n\n[Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v1_0) is the base model for the Breeze-7B series. \nIt is suitable for use if you have substantial fine-tuning data to tune it for your specific use case.\n\n[Breeze-7B-Instruct](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0) derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks.\n\n[Breeze-7B-32k-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0) is extended from the base model with more data, base change, and the disabling of the sliding window. 
\nRoughly speaking, that is equivalent to 44k Traditional Chinese characters.\n\n[Breeze-7B-32k-Instruct](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Instruct-v1_0) derives from the base model Breeze-7B-32k-Base, making the resulting model amenable to be used as-is for commonly seen tasks.\n\n\n\nPracticality-wise:\n- Breeze-7B-Base expands the original vocabulary with additional 30,000 Traditional Chinese tokens. With the expanded vocabulary, everything else being equal, Breeze-7B operates at twice the inference speed for Traditional Chinese to Mistral-7B and Llama 7B. [See [Inference Performance](#inference-performance).]\n- Breeze-7B-Instruct can be used as is for common tasks such as Q&A, RAG, multi-round chat, and summarization.\n- Breeze-7B-32k-Instruct can perform tasks at a document level (For Chinese, 20 ~ 40 pages).\n\n*A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*\n\n## Features\n\n- Breeze-7B-32k-Base-v1_0\n  - Expanding the vocabulary dictionary size from 32k to 62k to better support Traditional Chinese\n  - 32k-token context length\n \n- Breeze-7B-32k-Instruct-v1_0\n  - Expanding the vocabulary dictionary size from 32k to 62k to better support Traditional Chinese\n  - 32k-token context length\n  - Multi-turn dialogue (without special handling for harmfulness)\n\n## Model Details\n\n- Breeze-7B-32k-Base-v1_0\n  - Pretrained from: [Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v1_0)\n  - Model type: Causal decoder-only transformer language model\n  - Language: English and Traditional Chinese (zh-tw)\n- Breeze-7B-32k-Instruct-v1_0\n  - Finetuned from: [Breeze-7B-32k-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0)\n  - Model type: Causal decoder-only transformer language model\n  - Language: English and Traditional Chinese (zh-tw)\n\n## Long-context Performance\n\n#### 
Needle-in-a-haystack Performance\n\nWe use the passkey retrieval task to test the model's ability to attend to different various depths in a given sequence.\nA key in placed within a long context distracting document for the model to retrieve.\nThe key position is binned into 16 bins, and there are 20 testcases for each bin.\nBreeze-7B-32k-Base clears the tasks with 90+% accuracy, shown in the figure below.\n![Needle-in-a-haystack Performance](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/resolve/main/needle-in-a-haystack-performance.png)\n\n#### Long-DRCD Performance\n\n| **Model/Performance(EM)** | **DRCD** | **DRCD-16k** | **DRCD-32k** |\n|---------------------------|----------|--------------|--------------|\n| **Breeze-7B-32k-Instruct-v1\\_0** |  76.9   |   54.82   |  44.26     |       \n| **Breeze-7B-32k-Base-v1\\_0** | 79.73    | 69.68        | 61.55        |\n| **Breeze-7B-Base-v1\\_0**      | 80.61    | 21.79        | 15.29        | \n\n#### Short-Benchmark Performance\n\n| **Model/Performance(EM)** | **TMMLU+** | **MMLU** | **TABLE** | **MT-Bench-tw** | **MT-Bench** |\n|---------------------------|----------|--------------|--------------|-----|-----|\n| **Breeze-7B-32k-Instruct-v1\\_0** | 41.37    |   61.34   |   34    | 5.8 | 7.4 |\n| **Breeze-7B-Instruct-v1\\_0**      |  42.67  |   62.73    |   39.58 | 6.0 | 7.4 |\n\n## Use in Transformers\n\nFirst, install direct dependencies:\n```\npip install transformers torch accelerate\n```\n<p style=\"color:red;\">Flash-attention2 is strongly recommended for long context scenarios.</p>\n\n```bash\npip install packaging ninja\npip install flash-attn\n```\nThen load the model in transformers:\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\nimport torch\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    \"MediaTek-Research/Breeze-7B-32k-Base-v1_0\",\n    device_map=\"auto\",\n    torch_dtype=torch.bfloat16,\n    attn_implementation=\"flash_attention_2\" # optional but 
highly recommended\n)\nfrom transformers import AutoTokenizer\ntokenizer = AutoTokenizer.from_pretrained(\"MediaTek-Research/Breeze-7B-32k-Base-v1_0\")\ntokenizer.tokenize(\"你好，我可以幫助您解決各種問題、提供資訊和協助您完成許多不同的任務。例如：回答技術問題、提供建議、翻譯文字、尋找資料或協助您安排行程等。請告訴我如何能幫助您。\")\n# Tokenized results\n# ['▁', '你好', '，', '我', '可以', '幫助', '您', '解決', '各種', '問題', '、', '提供', '資訊', '和', '協助', '您', '完成', '許多', '不同', '的', '任務', '。', '例如', '：', '回答', '技術', '問題', '、', '提供', '建議', '、', '翻譯', '文字', '、', '尋找', '資料', '或', '協助', '您', '安排', '行程', '等', '。', '請', '告訴', '我', '如何', '能', '幫助', '您', '。']\n```\n\n\n## Citation\n\n```\n@article{MediaTek-Research2024breeze7b,\n      title={Breeze-7B Technical Report}, \n      author={Chan-Jan Hsu and Chang-Le Liu and Feng-Ting Liao and Po-Chun Hsu and Yi-Chang Chen and Da-Shan Shiu},\n      year={2024},\n      eprint={2403.02712},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2403.02712",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 0,
  "downloads": 521,
  "gated": false,
  "private": false,
  "last_modified": "2024-09-18T16:45:46.000Z",
  "created_at": "2024-09-18T11:15:37.000Z",
  "pipeline_tag": "",
  "library_name": ""
}

Source payload excerpt (from Hugging Face API)

{
  "_id": "66eab659bcfa5271dcd4fc54",
  "id": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "modelId": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "sha": "0ab9dc3082b94df989969ea8246bc300380c47ae",
  "createdAt": "2024-09-18T11:15:37.000Z",
  "lastModified": "2024-09-18T16:45:46.000Z",
  "author": "RichardErkhov",
  "downloads": 521,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
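The card values earlier in this sheet (dates, placeholders for empty fields) can be derived from this payload. The sketch below shows one plausible normalization, not the sheet's actual code: the function names `parse_hf_timestamp`, `display`, and `summarize` are hypothetical, and the mapping from empty strings to a dash placeholder is inferred from how the Pipeline and Library cards render.

```python
import json
from datetime import datetime

# Payload excerpt copied from the sheet.
PAYLOAD = json.loads("""
{
  "id": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "createdAt": "2024-09-18T11:15:37.000Z",
  "lastModified": "2024-09-18T16:45:46.000Z",
  "downloads": 521,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": ""
}
""")

PLACEHOLDER = "\u2014"  # the dash the sheet shows for empty fields


def parse_hf_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    # Replacing Z keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def display(value) -> str:
    """Render a field for the sheet, using the placeholder for empty values."""
    return PLACEHOLDER if value in ("", None) else str(value)


def summarize(payload: dict) -> dict:
    """Map raw payload fields to the label/value pairs shown on the cards."""
    created = parse_hf_timestamp(payload["createdAt"])
    modified = parse_hf_timestamp(payload["lastModified"])
    return {
        "Created": created.date().isoformat(),
        "Last Modified": modified.date().isoformat(),
        "Pipeline Task": display(payload["pipeline_tag"]),
        "Library": display(payload["library_name"]),
        "Downloads": display(payload["downloads"]),
        "Likes": display(payload["likes"]),
        "Seconds between creation and last change": int((modified - created).total_seconds()),
    }


if __name__ == "__main__":
    for label, value in summarize(PAYLOAD).items():
        print(f"{label}: {value}")
```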