RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf overview
MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family built on top of Mistral-7B and intended specifically for Traditional Chinese use.

Breeze-7B-Base is the base model for the Breeze-7B series. It is suitable if you have substantial fine-tuning data to tune it for your specific use case. Breeze-7B-Instruct derives from Breeze-7B-Base and can be used as-is for commonly seen tasks.

Breeze-7B-32k-Base extends the base model with more data, a base change, and a disabled sliding window to support a 32k-token context length. Roughly speaking, that is equivalent to 44k Traditional Chinese characters. Breeze-7B-32k-Instruct derives from Breeze-7B-32k-Base and can be used as-is for commonly seen tasks.

Practicality-wise:
- Breeze-7B-Base expands the original vocabulary with an additional 30,000 Traditional Chinese tokens. With the expanded vocabulary, everything else being equal, Breeze-7B runs inference on Traditional Chinese at twice the speed of Mistral-7B and Llama 7B.
- Breeze-7B-Instruct can be used as-is for common tasks such as Q&A, RAG, multi-round chat, and summarization.
- Breeze-7B-32k-Instruct can perform tasks at the document level (for Chinese, 20 to 40 pages).

A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.
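The overview equates the 32k-token context of the 32k models with roughly 44k Traditional Chinese characters. The sketch below turns that ratio into a quick capacity estimate. It assumes "32k" means 32,000 tokens and that the ratio holds uniformly across a text, neither of which the model card states; the function names are hypothetical, and an exact count would require running the model's tokenizer.

```python
import math

# Figures taken from the overview; the exact token count behind "32k" is an assumption.
CONTEXT_TOKENS = 32_000   # assumption: "32k" read as 32,000 tokens
CONTEXT_CHARS = 44_000    # "44k Traditional Chinese characters"

# Rough characters-per-token ratio implied by the two figures above.
CHARS_PER_TOKEN = CONTEXT_CHARS / CONTEXT_TOKENS  # 1.375


def estimate_tokens(text: str) -> int:
    """Estimate the token count of Traditional Chinese text from its length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, reserve_for_output: int = 0) -> bool:
    """Check whether the text, plus tokens reserved for generation, fits."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_TOKENS
```

Reserving some tokens for the generated output (`reserve_for_output`) is a design choice for this sketch; a prompt that fills the entire window leaves no room for completion.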
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Breeze-7B-32k-Base-v1_0.IQ3_M.gguf | GGUF | IQ3_M | 3.21 GB | Download |
| Breeze-7B-32k-Base-v1_0.IQ3_S.gguf | GGUF | IQ3_S | 3.11 GB | Download |
| Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf | GGUF | IQ3_XS | 2.96 GB | Download |
| Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf | GGUF | IQ4_NL | 4.03 GB | Download |
| Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf | GGUF | IQ4_XS | 3.83 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q2_K.gguf | GGUF | Q2_K | 2.67 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q3_K.gguf | GGUF | Q3_K | 3.42 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf | GGUF | Q3_K_L | 3.71 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf | GGUF | Q3_K_M | 3.42 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf | GGUF | Q3_K_S | 3.09 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q4_0.gguf | GGUF | Q4_0 | 3.99 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q4_1.gguf | GGUF | Q4_1 | 4.41 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q4_K.gguf | GGUF | Q4_K | 4.23 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf | GGUF | Q4_K_M | 4.23 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf | GGUF | Q4_K_S | 4.02 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q5_0.gguf | GGUF | Q5_0 | 4.83 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q5_1.gguf | GGUF | Q5_1 | 5.25 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q5_K.gguf | GGUF | Q5_K | 4.95 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf | GGUF | Q5_K_M | 4.95 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf | GGUF | Q5_K_S | 4.83 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q6_K.gguf | GGUF | Q6_K | 5.73 GB | Download |
| Breeze-7B-32k-Base-v1_0.Q8_0.gguf | GGUF | Q8_0 | 7.41 GB | Download |
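Choosing among the files above usually comes down to how much disk or memory is available. The following is a minimal sketch, not part of the repository, that picks the largest file fitting a size budget using the sizes from the table. The quantization names are taken from the file names; `pick_quant`, `filename_for`, and `download` are hypothetical helpers. The download step assumes the `huggingface_hub` package is installed. File size is only a lower bound on runtime memory, since the context cache needs additional space.

```python
from typing import Optional

REPO_ID = "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf"

# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "IQ3_M": 3.21, "IQ3_S": 3.11, "IQ3_XS": 2.96, "IQ4_NL": 4.03,
    "IQ4_XS": 3.83, "Q2_K": 2.67, "Q3_K": 3.42, "Q3_K_L": 3.71,
    "Q3_K_M": 3.42, "Q3_K_S": 3.09, "Q4_0": 3.99, "Q4_1": 4.41,
    "Q4_K": 4.23, "Q4_K_M": 4.23, "Q4_K_S": 4.02, "Q5_0": 4.83,
    "Q5_1": 5.25, "Q5_K": 4.95, "Q5_K_M": 4.95, "Q5_K_S": 4.83,
    "Q6_K": 5.73, "Q8_0": 7.41,
}


def filename_for(quant: str) -> str:
    """Build the file name used in this repository for a quantization."""
    return f"Breeze-7B-32k-Base-v1_0.{quant}.gguf"


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the largest quantization whose file fits the budget, or None.

    Files of equal size are ordered by name, so the result is deterministic.
    """
    candidates = [(size, name) for name, size in QUANT_SIZES_GB.items()
                  if size <= budget_gb]
    return max(candidates)[1] if candidates else None


def download(quant: str) -> str:
    """Download one file and return its local path (requires network access)."""
    # Assumption: huggingface_hub is installed; imported lazily so the
    # selection logic above works without it.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=REPO_ID, filename=filename_for(quant))
```

For example, `pick_quant(4.5)` selects `Q4_1`, and `download(pick_quant(4.5))` would fetch `Breeze-7B-32k-Base-v1_0.Q4_1.gguf`. Picking by size alone is a simplification: two quantizations of similar size can differ in quality.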
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"frontmatter": {},
"hero_image_url": "https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/resolve/main/needle-in-a-haystack-performance.png",
"summary": "MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use. Breeze-7B-Base is the base model for the Breeze-7B series. It is suitable for use if you have substantial fine-tuning data to tune it for your specific use case. Breeze-7B-Instruct derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks. Breeze-7B-32k-Base is extended from the base model with more data, base change, and the disabling of the sliding window. Roughly speaking, that is equivalent to 44k Traditional Chinese characters. Breeze-7B-32k-Instruct derives from the base model Breeze-7B-32k-Base, making the resulting model amenable to be used as-is for commonly seen tasks. Practicality-wise: *A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nBreeze-7B-32k-Base-v1_0 - GGUF\n- Model creator: https://huggingface.co/MediaTek-Research/\n- Original model: https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [Breeze-7B-32k-Base-v1_0.Q2_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q2_K.gguf) | Q2_K | 2.67GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_XS.gguf) | IQ3_XS | 2.96GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_S.gguf) | IQ3_S | 3.11GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_S.gguf) | Q3_K_S | 3.09GB |\n| [Breeze-7B-32k-Base-v1_0.IQ3_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ3_M.gguf) | IQ3_M | 3.21GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K.gguf) | Q3_K | 3.42GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_M.gguf) | Q3_K_M | 3.42GB |\n| [Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q3_K_L.gguf) | Q3_K_L | 3.71GB |\n| 
[Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ4_XS.gguf) | IQ4_XS | 3.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_0.gguf) | Q4_0 | 3.99GB |\n| [Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.IQ4_NL.gguf) | IQ4_NL | 4.03GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K_S.gguf) | Q4_K_S | 4.02GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K.gguf) | Q4_K | 4.23GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_K_M.gguf) | Q4_K_M | 4.23GB |\n| [Breeze-7B-32k-Base-v1_0.Q4_1.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q4_1.gguf) | Q4_1 | 4.41GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_0.gguf) | Q5_0 | 4.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K_S.gguf) | Q5_K_S | 4.83GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K.gguf) | Q5_K | 4.95GB |\n| 
[Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_K_M.gguf) | Q5_K_M | 4.95GB |\n| [Breeze-7B-32k-Base-v1_0.Q5_1.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q5_1.gguf) | Q5_1 | 5.25GB |\n| [Breeze-7B-32k-Base-v1_0.Q6_K.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q6_K.gguf) | Q6_K | 5.73GB |\n| [Breeze-7B-32k-Base-v1_0.Q8_0.gguf](https://huggingface.co/RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf/blob/main/Breeze-7B-32k-Base-v1_0.Q8_0.gguf) | Q8_0 | 7.41GB |\n\n\n\n\nOriginal model description:\n---\npipeline_tag: text-generation\nlicense: apache-2.0\nlanguage:\n- zh\n- en\n---\n\n# Model Card for MediaTek Research Breeze-7B-32k-Base-v1_0\n\nMediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1), specifically intended for Traditional Chinese use.\n\n[Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v1_0) is the base model for the Breeze-7B series. \nIt is suitable for use if you have substantial fine-tuning data to tune it for your specific use case.\n\n[Breeze-7B-Instruct](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0) derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks.\n\n[Breeze-7B-32k-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0) is extended from the base model with more data, base change, and the disabling of the sliding window. 
\nRoughly speaking, that is equivalent to 44k Traditional Chinese characters.\n\n[Breeze-7B-32k-Instruct](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Instruct-v1_0) derives from the base model Breeze-7B-32k-Base, making the resulting model amenable to be used as-is for commonly seen tasks.\n\n\n\nPracticality-wise:\n- Breeze-7B-Base expands the original vocabulary with additional 30,000 Traditional Chinese tokens. With the expanded vocabulary, everything else being equal, Breeze-7B operates at twice the inference speed for Traditional Chinese to Mistral-7B and Llama 7B. [See [Inference Performance](#inference-performance).]\n- Breeze-7B-Instruct can be used as is for common tasks such as Q&A, RAG, multi-round chat, and summarization.\n- Breeze-7B-32k-Instruct can perform tasks at a document level (For Chinese, 20 ~ 40 pages).\n\n*A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*\n\n## Features\n\n- Breeze-7B-32k-Base-v1_0\n - Expanding the vocabulary dictionary size from 32k to 62k to better support Traditional Chinese\n - 32k-token context length\n \n- Breeze-7B-32k-Instruct-v1_0\n - Expanding the vocabulary dictionary size from 32k to 62k to better support Traditional Chinese\n - 32k-token context length\n - Multi-turn dialogue (without special handling for harmfulness)\n\n## Model Details\n\n- Breeze-7B-32k-Base-v1_0\n - Pretrained from: [Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v1_0)\n - Model type: Causal decoder-only transformer language model\n - Language: English and Traditional Chinese (zh-tw)\n- Breeze-7B-32k-Instruct-v1_0\n - Finetuned from: [Breeze-7B-32k-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-32k-Base-v1_0)\n - Model type: Causal decoder-only transformer language model\n - Language: English and Traditional Chinese (zh-tw)\n\n## Long-context Performance\n\n#### Needle-in-a-haystack 
Performance\n\nWe use the passkey retrieval task to test the model's ability to attend to various depths in a given sequence.\nA key is placed within a long, distracting context document for the model to retrieve.\nThe key position is binned into 16 bins, and there are 20 test cases for each bin.\nBreeze-7B-32k-Base clears the tasks with 90+% accuracy, shown in the figure below.\n\n\n#### Long-DRCD Performance\n\n| **Model/Performance(EM)** | **DRCD** | **DRCD-16k** | **DRCD-32k** |\n|---------------------------|----------|--------------|--------------|\n| **Breeze-7B-32k-Instruct-v1\\_0** | 76.9 | 54.82 | 44.26 | \n| **Breeze-7B-32k-Base-v1\\_0** | 79.73 | 69.68 | 61.55 |\n| **Breeze-7B-Base-v1\\_0** | 80.61 | 21.79 | 15.29 | \n\n#### Short-Benchmark Performance\n\n| **Model/Performance(EM)** | **TMMLU+** | **MMLU** | **TABLE** | **MT-Bench-tw** | **MT-Bench** |\n|---------------------------|----------|--------------|--------------|-----|-----|\n| **Breeze-7B-32k-Instruct-v1\\_0** | 41.37 | 61.34 | 34 | 5.8 | 7.4 |\n| **Breeze-7B-Instruct-v1\\_0** | 42.67 | 62.73 | 39.58 | 6.0 | 7.4 |\n\n## Use in Transformers\n\nFirst, install direct dependencies:\n```\npip install transformers torch accelerate\n```\n<p style=\"color:red;\">Flash-attention2 is strongly recommended for long context scenarios.</p>\n\n```bash\npip install packaging ninja\npip install flash-attn\n```\nThen load the model in transformers:\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\nimport torch\n\nmodel = AutoModelForCausalLM.from_pretrained(\n \"MediaTek-Research/Breeze-7B-32k-Base-v1_0\",\n device_map=\"auto\",\n torch_dtype=torch.bfloat16,\n attn_implementation=\"flash_attention_2\" # optional but highly recommended\n)\ntokenizer = AutoTokenizer.from_pretrained(\"MediaTek-Research/Breeze-7B-32k-Base-v1_0\")\ntokenizer.tokenize(\"你好,我可以幫助您解決各種問題、提供資訊和協助您完成許多不同的任務。例如:回答技術問題、提供建議、翻譯文字、尋找資料或協助您安排行程等。請告訴我如何能幫助您。\")\n# 
Tokenized results\n# ['▁', '你好', ',', '我', '可以', '幫助', '您', '解決', '各種', '問題', '、', '提供', '資訊', '和', '協助', '您', '完成', '許多', '不同', '的', '任務', '。', '例如', ':', '回答', '技術', '問題', '、', '提供', '建議', '、', '翻譯', '文字', '、', '尋找', '資料', '或', '協助', '您', '安排', '行程', '等', '。', '請', '告訴', '我', '如何', '能', '幫助', '您', '。']\n```\n\n\n## Citation\n\n```\n@article{MediaTek-Research2024breeze7b,\n title={Breeze-7B Technical Report}, \n author={Chan-Jan Hsu and Chang-Le Liu and Feng-Ting Liao and Po-Chun Hsu and Yi-Chang Chen and Da-Shan Shiu},\n year={2024},\n eprint={2403.02712},\n archivePrefix={arXiv},\n primaryClass={cs.CL}\n}\n```\n\n\n",
"related_quantizations": []
},
"tags": [
"gguf",
"arxiv:2403.02712",
"endpoints_compatible",
"region:us"
],
"likes": 0,
"downloads": 521,
"gated": false,
"private": false,
"last_modified": "2024-09-18T16:45:46.000Z",
"created_at": "2024-09-18T11:15:37.000Z",
"pipeline_tag": "",
"library_name": ""
}
Source payload excerpt (from Hugging Face API)
{
"_id": "66eab659bcfa5271dcd4fc54",
"id": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
"modelId": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
"sha": "0ab9dc3082b94df989969ea8246bc300380c47ae",
"createdAt": "2024-09-18T11:15:37.000Z",
"lastModified": "2024-09-18T16:45:46.000Z",
"author": "RichardErkhov",
"downloads": 521,
"likes": 0,
"gated": false,
"private": false,
"pipeline_tag": "",
"library_name": "",
"siblings_count": 24
}
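The payload excerpt can be read with the standard library alone. The sketch below embeds a subset of the fields shown above, parses the timestamps, and splits the repository id into owner and name. The variable and function names are hypothetical, and the interpretation of the time difference (elapsed time between repository creation and last modification) is only what the two fields literally state.

```python
import json
from datetime import datetime, timezone

# Subset of the payload excerpt shown above, embedded so the example is self-contained.
PAYLOAD = """
{
  "id": "RichardErkhov/MediaTek-Research_-_Breeze-7B-32k-Base-v1_0-gguf",
  "createdAt": "2024-09-18T11:15:37.000Z",
  "lastModified": "2024-09-18T16:45:46.000Z",
  "downloads": 521,
  "likes": 0,
  "siblings_count": 24
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse the ISO-8601 UTC timestamps used in the payload."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


payload = json.loads(PAYLOAD)
owner, repo_name = payload["id"].split("/", 1)
created = parse_timestamp(payload["createdAt"])
modified = parse_timestamp(payload["lastModified"])
elapsed = modified - created  # time between creation and last modification

print(f"{owner} / {repo_name}")
print(f"created {created.isoformat()}, last modified after {elapsed}")
```

Using `strptime` with an explicit format avoids relying on `datetime.fromisoformat` accepting the trailing `Z`, which depends on the Python version.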