Model Intelligence Sheet

richarderkhov/rakuten_-_rakutenai-7b-chat-gguf overview

Model Description

RakutenAI-7B is a systematic initiative that brings the latest technologies to the world of Japanese LLMs. RakutenAI-7B achieves the best scores on the Japanese language understanding benchmarks while maintaining competitive performance on the English test sets among similar models such as OpenCalm, Elyza, Youri, Nekomata and Swallow. RakutenAI-7B uses the Mistral model architecture and is based on the Mistral-7B-v0.1 pre-trained checkpoint, exemplifying a successful retrofitting of the pre-trained model weights. Moreover, we extend Mistral's vocabulary from 32k to 48k tokens to offer a better character-per-token rate for Japanese.

The technical report is available on arXiv (arXiv:2403.15484). If you are looking for a foundation model, check RakutenAI-7B. If you are looking for an instruction-tuned model, check RakutenAI-7B-instruct.

An independent evaluation by Kamata et al. for Nejumi LLMリーダーボード Neo, using a weighted average of llm-jp-eval and Japanese MT-bench, also confirms that the chat and instruct versions of RakutenAI-7B achieve the highest performance among open LLMs of similar size, with scores of 0.393 and 0.331 respectively, as of 22 March 2024.
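The character-per-token rate mentioned above can be illustrated with a small sketch. The two tokenizers below are toy stand-ins chosen for illustration, not the Mistral or RakutenAI tokenizers: one has no Japanese entries and falls back to UTF-8 bytes, the other has every character in its vocabulary. The function names are hypothetical.

```python
def chars_per_token(text, tokenize):
    """Average number of characters covered by each token of text."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("tokenizer produced no tokens")
    return len(text) / len(tokens)


def byte_tokenize(text):
    """Toy tokenizer: one token per UTF-8 byte (no Japanese entries in the vocabulary)."""
    return list(text.encode("utf-8"))


def char_tokenize(text):
    """Toy tokenizer: one token per character (every character is in the vocabulary)."""
    return list(text)


sample = "馬が合う"
print(chars_per_token(sample, byte_tokenize))  # lower rate: 3 bytes per character
print(chars_per_token(sample, char_tokenize))  # higher rate: 1 token per character
```

A higher rate means the same Japanese text consumes fewer tokens, which is the effect a larger vocabulary is meant to achieve.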

Tags: gguf, arxiv:2403.15484, endpoints_compatible, region:us

richarderkhov/rakuten_-_rakutenai-7b-chat-gguf visual

Downloads

90

Likes

0

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

22 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
RakutenAI-7B-chat.IQ3_M.gguf	GGUF	IQ3_M	3.14 GB	Download
RakutenAI-7B-chat.IQ3_S.gguf	GGUF	IQ3_S	3.04 GB	Download
RakutenAI-7B-chat.IQ3_XS.gguf	GGUF	IQ3_XS	2.89 GB	Download
RakutenAI-7B-chat.IQ4_NL.gguf	GGUF	IQ4_NL	3.95 GB	Download
RakutenAI-7B-chat.IQ4_XS.gguf	GGUF	IQ4_XS	3.76 GB	Download
RakutenAI-7B-chat.Q2_K.gguf	GGUF	Q2_K	2.60 GB	Download
RakutenAI-7B-chat.Q3_K.gguf	GGUF	Q3_K	3.35 GB	Download
RakutenAI-7B-chat.Q3_K_L.gguf	GGUF	Q3_K_L	3.64 GB	Download
RakutenAI-7B-chat.Q3_K_M.gguf	GGUF	Q3_K_M	3.35 GB	Download
RakutenAI-7B-chat.Q3_K_S.gguf	GGUF	Q3_K_S	3.02 GB	Download
RakutenAI-7B-chat.Q4_0.gguf	GGUF	Q4_0	3.91 GB	Download
RakutenAI-7B-chat.Q4_1.gguf	GGUF	Q4_1	4.33 GB	Download
RakutenAI-7B-chat.Q4_K.gguf	GGUF	Q4_K	4.15 GB	Download
RakutenAI-7B-chat.Q4_K_M.gguf	GGUF	Q4_K_M	4.15 GB	Download
RakutenAI-7B-chat.Q4_K_S.gguf	GGUF	Q4_K_S	3.94 GB	Download
RakutenAI-7B-chat.Q5_0.gguf	GGUF	Q5_0	4.75 GB	Download
RakutenAI-7B-chat.Q5_1.gguf	GGUF	Q5_1	5.16 GB	Download
RakutenAI-7B-chat.Q5_K.gguf	GGUF	Q5_K	4.87 GB	Download
RakutenAI-7B-chat.Q5_K_M.gguf	GGUF	Q5_K_M	4.87 GB	Download
RakutenAI-7B-chat.Q5_K_S.gguf	GGUF	Q5_K_S	4.75 GB	Download
RakutenAI-7B-chat.Q6_K.gguf	GGUF	Q6_K	5.63 GB	Download
RakutenAI-7B-chat.Q8_0.gguf	GGUF	Q8_0	7.30 GB	Download
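As a minimal sketch of how the table might be used, the snippet below picks the largest quantization that fits a size budget and builds its download URL. The helper names and the 4.5 GB budget are hypothetical; the sizes are a subset copied from the table, and the `resolve/main` URL pattern is assumed from the Hugging Face Hub's usual direct-download convention (the repository README itself links to `blob/main` pages). File size is only a lower bound on the memory needed at run time.

```python
# Sizes in GB, copied from the file table above (subset shown for brevity).
QUANT_SIZES_GB = {
    "Q2_K": 2.60,
    "IQ3_XS": 2.89,
    "Q3_K_M": 3.35,
    "IQ4_XS": 3.76,
    "Q4_K_M": 4.15,
    "Q5_K_M": 4.87,
    "Q6_K": 5.63,
    "Q8_0": 7.30,
}

REPO_ID = "RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf"


def pick_quant(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the largest quantization whose file fits in budget_gb, or None."""
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download_url(quant, repo_id=REPO_ID):
    """Build a direct-download URL (assumes the Hub's resolve/main pattern)."""
    filename = f"RakutenAI-7B-chat.{quant}.gguf"
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


choice = pick_quant(4.5)  # hypothetical budget in GB
print(choice, download_url(choice))
```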

Model Details Live

Model Slug

richarderkhov/rakuten_-_rakutenai-7b-chat-gguf

Author

RichardErkhov

Pipeline Task

—

Library

—

Created

2024-06-15

Last Modified

2024-06-15

Gated

No

Private

No

HF SHA

88070ae3ebb8b0f753bfcd7c5e9938887ba567b5

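License

Unknown for this repository (the original model card states Apache License, Version 2.0)

Language

Japanese, English (per the original model card)

Base Model

Rakuten/RakutenAI-7B-chat (per the repository README)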

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "## Model Description RakutenAI-7B is a systematic initiative that brings the latest technologies to the world of Japanese LLMs. RakutenAI-7B achieves the best scores on the Japanese language understanding benchmarks while maintaining a competitive performance on the English test sets among similar models such as OpenCalm, Elyza, Youri, Nekomata and Swallow. RakutenAI-7B leverages the Mistral model architecture and is based on Mistral-7B-v0.1 pre-trained checkpoint, exemplifying a successful retrofitting of the pre-trained model weights. Moreover, we extend Mistral's vocabulary from 32k to 48k to offer a better character-per-token rate for Japanese. *The technical report can be accessed at arXiv.* *If you are looking for a foundation model, check RakutenAI-7B*. *If you are looking for an instruction-tuned model, check RakutenAI-7B-instruct*. An independent evaluation by Kamata et.al. for Nejumi LLMリーダーボード Neo using a weighted average of llm-jp-eval and Japanese MT-bench also confirms the highest performance of chat/instruct versions of RakutenAI-7B among Open LLMs of similar sizes, with a score of 0.393/0.331 respectively, as of 22nd March 2024.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nRakutenAI-7B-chat - GGUF\n- Model creator: https://huggingface.co/Rakuten/\n- Original model: https://huggingface.co/Rakuten/RakutenAI-7B-chat/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [RakutenAI-7B-chat.Q2_K.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q2_K.gguf) | Q2_K | 2.6GB |\n| [RakutenAI-7B-chat.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.IQ3_XS.gguf) | IQ3_XS | 2.89GB |\n| [RakutenAI-7B-chat.IQ3_S.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.IQ3_S.gguf) | IQ3_S | 3.04GB |\n| [RakutenAI-7B-chat.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q3_K_S.gguf) | Q3_K_S | 3.02GB |\n| [RakutenAI-7B-chat.IQ3_M.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.IQ3_M.gguf) | IQ3_M | 3.14GB |\n| [RakutenAI-7B-chat.Q3_K.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q3_K.gguf) | Q3_K | 3.35GB |\n| [RakutenAI-7B-chat.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q3_K_M.gguf) | Q3_K_M | 3.35GB |\n| [RakutenAI-7B-chat.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q3_K_L.gguf) | Q3_K_L | 3.64GB |\n| [RakutenAI-7B-chat.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.IQ4_XS.gguf) | IQ4_XS | 3.76GB |\n| 
[RakutenAI-7B-chat.Q4_0.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q4_0.gguf) | Q4_0 | 3.91GB |\n| [RakutenAI-7B-chat.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.IQ4_NL.gguf) | IQ4_NL | 3.95GB |\n| [RakutenAI-7B-chat.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q4_K_S.gguf) | Q4_K_S | 3.94GB |\n| [RakutenAI-7B-chat.Q4_K.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q4_K.gguf) | Q4_K | 4.15GB |\n| [RakutenAI-7B-chat.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q4_K_M.gguf) | Q4_K_M | 4.15GB |\n| [RakutenAI-7B-chat.Q4_1.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q4_1.gguf) | Q4_1 | 4.33GB |\n| [RakutenAI-7B-chat.Q5_0.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q5_0.gguf) | Q5_0 | 4.75GB |\n| [RakutenAI-7B-chat.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q5_K_S.gguf) | Q5_K_S | 4.75GB |\n| [RakutenAI-7B-chat.Q5_K.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q5_K.gguf) | Q5_K | 4.87GB |\n| [RakutenAI-7B-chat.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q5_K_M.gguf) | Q5_K_M | 4.87GB |\n| [RakutenAI-7B-chat.Q5_1.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q5_1.gguf) | Q5_1 | 5.16GB |\n| [RakutenAI-7B-chat.Q6_K.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q6_K.gguf) | Q6_K | 5.63GB |\n| 
[RakutenAI-7B-chat.Q8_0.gguf](https://huggingface.co/RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf/blob/main/RakutenAI-7B-chat.Q8_0.gguf) | Q8_0 | 7.3GB |\n\n\n\n\nOriginal model description:\n---\nlicense: apache-2.0\n---\n# RakutenAI-7B-chat\n## Model Description\nRakutenAI-7B is a systematic initiative that brings the latest technologies to the world of Japanese LLMs. RakutenAI-7B achieves the best scores on the Japanese language understanding benchmarks while maintaining a competitive performance on the English test sets among similar models such as OpenCalm, Elyza, Youri, Nekomata and Swallow. RakutenAI-7B leverages the Mistral model architecture and is based on [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) pre-trained checkpoint, exemplifying a successful retrofitting of the pre-trained model weights. Moreover, we extend Mistral's vocabulary from 32k to 48k to offer a better character-per-token rate for Japanese.\n\n*The technical report can be accessed at [arXiv](https://arxiv.org/abs/2403.15484).*\n\n*If you are looking for a foundation model, check [RakutenAI-7B](https://huggingface.co/Rakuten/RakutenAI-7B)*.\n\n*If you are looking for an instruction-tuned model, check [RakutenAI-7B-instruct](https://huggingface.co/Rakuten/RakutenAI-7B-instruct)*.\n\nAn independent evaluation by Kamata et.al. 
for [Nejumi LLMリーダーボード Neo](https://wandb.ai/wandb-japan/llm-leaderboard/reports/Nejumi-LLM-Neo--Vmlldzo2MTkyMTU0#総合評価) using a weighted average of [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval) and [Japanese MT-bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge) also confirms the highest performance of chat/instruct versions of RakutenAI-7B among Open LLMs of similar sizes, with a score of 0.393/0.331 respectively, as of 22nd March 2024.\n\n## Usage\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_path = \"Rakuten/RakutenAI-7B-chat\"\ntokenizer = AutoTokenizer.from_pretrained(model_path)\nmodel = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=\"auto\", device_map=\"auto\")\nmodel.eval()\n\nrequests = [\n    \"「馬が合う」はどう言う意味ですか\",\n    \"How to make an authentic Spanish Omelette?\",\n]\n\nsystem_message = \"A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. 
USER: {user_input} ASSISTANT:\"\n\nfor req in requests:\n    input_req = system_message.format(user_input=req)\n    input_ids = tokenizer.encode(input_req, return_tensors=\"pt\").to(device=model.device)\n    tokens = model.generate(\n        input_ids,\n        max_new_tokens=1024,\n        do_sample=True,\n        pad_token_id=tokenizer.eos_token_id,\n    )\n    out = tokenizer.decode(tokens[0][len(input_ids[0]):], skip_special_tokens=True)\n    print(\"USER:\\n\" + req)\n    print(\"ASSISTANT:\\n\" + out)\n    print()\n    print()\n```\n\n## Model Details\n\n* **Developed by**: [Rakuten Group, Inc.](https://ai.rakuten.com/)\n* **Language(s)**: Japanese, English\n* **License**: This model is licensed under [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).\n* **Instruction-Tuning Dataset**: We fine-tune our foundation model to create RakutenAI-7B-instruct and RakutenAI-7B-chat using a mix of open source and internally hand-crafted datasets. We use `train` part of the following datasets (CC by-SA License) for instruction-tuned and chat-tuned models:\n    - [JSNLI](https://nlp.ist.i.kyoto-u.ac.jp/?%E6%97%A5%E6%9C%AC%E8%AA%9ESNLI%28JSNLI%29%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88)\n    - [RTE](https://nlp.ist.i.kyoto-u.ac.jp/?Textual+Entailment+%E8%A9%95%E4%BE%A1%E3%83%87%E3%83%BC%E3%82%BF)\n    - [KUCI](https://nlp.ist.i.kyoto-u.ac.jp/?KUCI)\n    - [BELEBELE](https://huggingface.co/datasets/facebook/belebele)\n    - [JCS](https://aclanthology.org/2022.lrec-1.317/)\n    - [JNLI](https://aclanthology.org/2022.lrec-1.317/)\n    - [Dolly-15K](https://huggingface.co/datasets/databricks/databricks-dolly-15k)\n    - [OpenAssistant1](https://huggingface.co/datasets/OpenAssistant/oasst1)\n\n\n### Limitations and Bias\n\nThe suite of RakutenAI-7B models is capable of generating human-like text on a wide range of topics. However, like all LLMs, they have limitations and can produce biased, inaccurate, or unsafe outputs. 
Please exercise caution and judgement while interacting with them.\n\n## Citation\nFor citing our work on the suite of RakutenAI-7B models, please use: \n\n```\n@misc{rakutengroup2024rakutenai7b,\n      title={RakutenAI-7B: Extending Large Language Models for Japanese}, \n      author={{Rakuten Group, Inc.} and Aaron Levine and Connie Huang and Chenguang Wang and Eduardo Batista and Ewa Szymanska and Hongyi Ding and Hou Wei Chou and Jean-François Pessiot and Johanes Effendi and Justin Chiu and Kai Torben Ohlhus and Karan Chopra and Keiji Shinzato and Koji Murakami and Lee Xiong and Lei Chen and Maki Kubota and Maksim Tkachenko and Miroku Lee and Naoki Takahashi and Prathyusha Jwalapuram and Ryutaro Tatsushima and Saurabh Jain and Sunil Kumar Yadav and Ting Cai and Wei-Te Chen and Yandi Xia and Yuki Nakayama and Yutaka Higashiyama},\n      year={2024},\n      eprint={2403.15484},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2403.15484",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 0,
  "downloads": 90,
  "gated": false,
  "private": false,
  "last_modified": "2024-06-15T12:03:20.000Z",
  "created_at": "2024-06-15T09:37:55.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
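The Usage section embedded in `readme_markdown` targets the original Transformers checkpoint, not the GGUF files in this repository. The sketch below reuses the prompt template from that section and shows one way it might be passed to a GGUF runtime. It assumes the `llama-cpp-python` package is installed; `model_path`, the `n_ctx` value, and the `stop` sequence are illustrative choices, not settings taken from the model card.

```python
# Prompt template copied from the Usage section of the original model card.
PROMPT_TEMPLATE = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: {user_input} ASSISTANT:"
)


def build_prompt(user_input):
    """Fill the single-turn chat template with the user's message."""
    return PROMPT_TEMPLATE.format(user_input=user_input)


def generate(model_path, user_input, max_tokens=256):
    """Run one completion with llama-cpp-python (assumed installed).

    model_path is a hypothetical local path to a downloaded .gguf file.
    """
    from llama_cpp import Llama  # imported lazily; optional dependency

    llm = Llama(model_path=model_path, n_ctx=2048)  # n_ctx is an assumption
    # Stopping at "USER:" is an assumption to keep the reply to one turn.
    result = llm(build_prompt(user_input), max_tokens=max_tokens, stop=["USER:"])
    return result["choices"][0]["text"]


print(build_prompt("How to make an authentic Spanish Omelette?"))
```

Only `build_prompt` runs without a model file; `generate` needs a downloaded quantization from the file table.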

Source payload excerpt (from Hugging Face API)

{
  "_id": "666d60f314f1c262fe648c93",
  "id": "RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf",
  "modelId": "RichardErkhov/Rakuten_-_RakutenAI-7B-chat-gguf",
  "sha": "88070ae3ebb8b0f753bfcd7c5e9938887ba567b5",
  "createdAt": "2024-06-15T09:37:55.000Z",
  "lastModified": "2024-06-15T12:03:20.000Z",
  "author": "RichardErkhov",
  "downloads": 90,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
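The `createdAt` and `lastModified` fields are UTC timestamps, which is why the detail panel shows the same date for both. A minimal sketch of parsing them with the standard library follows; the `parse_hub_timestamp` helper is hypothetical, and the format string assumes the millisecond-plus-`Z` layout seen in this excerpt.

```python
from datetime import datetime, timezone

# Values copied from the payload excerpt above.
payload = {
    "createdAt": "2024-06-15T09:37:55.000Z",
    "lastModified": "2024-06-15T12:03:20.000Z",
}


def parse_hub_timestamp(value):
    """Parse a 'YYYY-MM-DDTHH:MM:SS.mmmZ' string into an aware UTC datetime."""
    naive = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return naive.replace(tzinfo=timezone.utc)


created = parse_hub_timestamp(payload["createdAt"])
modified = parse_hub_timestamp(payload["lastModified"])
elapsed = modified - created  # time between creation and the last commit
print(elapsed)
```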