Model Intelligence Sheet

richarderkhov/croissantllm_-_croissantllmchat-v0.1-gguf overview

This model is part of the CroissantLLM initiative (https://arxiv.org/abs/2402.00786). It corresponds to the checkpoint after 190k steps (2.99T tokens) followed by a final Chat finetuning phase. For best performance, use a temperature of 0.3 or more and the exact chat template from the original model card: <|im_start|>user, a newline, the user query, <|im_end|>, a newline, then <|im_start|>assistant followed by a newline.
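The template referred to above follows the ChatML-style layout given in the original model card (reproduced in the metadata further down). A minimal sketch of building that prompt by hand, for GGUF runtimes that take a raw string instead of a tokenizer chat template; `build_chat_prompt` and `SAMPLING` are hypothetical names, and the sampling values are copied from the card's `generate` example:

```python
def build_chat_prompt(user_query: str) -> str:
    """Format a single-turn prompt in the layout shown in the model card."""
    return (
        "<|im_start|>user\n"
        f"{user_query}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Sampling settings copied from the model card's generate() example.
# The card recommends a temperature of 0.3 or more.
SAMPLING = {
    "temperature": 0.3,
    "top_p": 0.90,
    "top_k": 40,
    "repetition_penalty": 1.05,
}

prompt = build_chat_prompt("Que puis-je faire à Marseille en hiver?")
print(prompt)
```

The card notes that generation might require a stopping criterion on the `<|im_end|>` token, so whichever runtime consumes this prompt would likely need that token configured as a stop sequence.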

Tags: gguf, arxiv:2402.00786, endpoints_compatible, region:us, conversational

richarderkhov/croissantllm_-_croissantllmchat-v0.1-gguf visual

Downloads

353

Likes

0

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

19 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
CroissantLLMChat-v0.1.IQ4_NL.gguf	GGUF	IQ4_NL	744.96 MB	Download
CroissantLLMChat-v0.1.IQ4_XS.gguf	GGUF	IQ4_XS	714.88 MB	Download
CroissantLLMChat-v0.1.Q2_K.gguf	GGUF	Q2_K	532.82 MB	Download
CroissantLLMChat-v0.1.Q3_K.gguf	GGUF	Q3_K	670.50 MB	Download
CroissantLLMChat-v0.1.Q3_K_L.gguf	GGUF	Q3_K_L	708.95 MB	Download
CroissantLLMChat-v0.1.Q3_K_M.gguf	GGUF	Q3_K_M	670.50 MB	Download
CroissantLLMChat-v0.1.Q3_K_S.gguf	GGUF	Q3_K_S	611.08 MB	Download
CroissantLLMChat-v0.1.Q4_0.gguf	GGUF	Q4_0	738.91 MB	Download
CroissantLLMChat-v0.1.Q4_1.gguf	GGUF	Q4_1	815.19 MB	Download
CroissantLLMChat-v0.1.Q4_K.gguf	GGUF	Q4_K	831.91 MB	Download
CroissantLLMChat-v0.1.Q4_K_M.gguf	GGUF	Q4_K_M	831.91 MB	Download
CroissantLLMChat-v0.1.Q4_K_S.gguf	GGUF	Q4_K_S	775.17 MB	Download
CroissantLLMChat-v0.1.Q5_0.gguf	GGUF	Q5_0	891.47 MB	Download
CroissantLLMChat-v0.1.Q5_1.gguf	GGUF	Q5_1	967.75 MB	Download
CroissantLLMChat-v0.1.Q5_K.gguf	GGUF	Q5_K	954.28 MB	Download
CroissantLLMChat-v0.1.Q5_K_M.gguf	GGUF	Q5_K_M	954.28 MB	Download
CroissantLLMChat-v0.1.Q5_K_S.gguf	GGUF	Q5_K_S	907.60 MB	Download
CroissantLLMChat-v0.1.Q6_K.gguf	GGUF	Q6_K	1.09 GB	Download
CroissantLLMChat-v0.1.Q8_0.gguf	GGUF	Q8_0	1.33 GB	Download
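As a rough illustration of how the table above might be used, the sketch below picks the largest quantization that fits a size budget and builds a direct-download link for it. The sizes are a subset copied from the table. `pick_quant` and `download_url` are hypothetical helpers, and the `resolve/main` path is an assumption based on the Hub's usual direct-download URL form (the README in this sheet only shows `blob/main` links).

```python
from typing import Optional

REPO_ID = "RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf"

# File sizes in MB, copied from the table above (subset for brevity).
QUANT_SIZES_MB = {
    "Q2_K": 532.82,
    "Q3_K_M": 670.50,
    "Q4_0": 738.91,
    "Q4_K_M": 831.91,
    "Q5_K_M": 954.28,
}


def pick_quant(budget_mb: float) -> Optional[str]:
    """Return the largest listed quantization whose file fits the budget."""
    fitting = {q: size for q, size in QUANT_SIZES_MB.items() if size <= budget_mb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download_url(quant: str) -> str:
    """Build a direct-download URL (assumed resolve/main form) for a quantization."""
    filename = f"CroissantLLMChat-v0.1.{quant}.gguf"
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{filename}"


choice = pick_quant(800)
if choice is not None:
    print(choice, download_url(choice))
```

File size is only a proxy here: a larger file generally keeps more precision, but the table does not report quality measurements for any of the variants.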

Model Details

Model Slug

richarderkhov/croissantllm_-_croissantllmchat-v0.1-gguf

Author

RichardErkhov

Pipeline Task

—

Library

—

Created

2024-11-04

Last Modified

2024-11-04

Gated

No

Private

No

HF SHA

99835c290cb9fbe4b9c99ac7858643913232b990

License

Unknown

Language

Unknown

Base Model

Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "This model is part of the CroissantLLM initiative, and corresponds to the checkpoint after 190k steps (2.99 T) tokens and a final Chat finetuning phase. https://arxiv.org/abs/2402.00786 For best performance, it should be used with a temperature of 0.3 or more, and with the exact template described below: ``python chat = [ {\"role\": \"user\", \"content\": \"Que puis-je faire à Marseille en hiver?\"}, ] chat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True) ` corresponding to: `python chat_input = \"\"\"user {USER QUERY} assistant\\n\"\"\" ``",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nCroissantLLMChat-v0.1 - GGUF\n- Model creator: https://huggingface.co/croissantllm/\n- Original model: https://huggingface.co/croissantllm/CroissantLLMChat-v0.1/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [CroissantLLMChat-v0.1.Q2_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q2_K.gguf) | Q2_K | 0.52GB |\n| [CroissantLLMChat-v0.1.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K_S.gguf) | Q3_K_S | 0.6GB |\n| [CroissantLLMChat-v0.1.Q3_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K.gguf) | Q3_K | 0.65GB |\n| [CroissantLLMChat-v0.1.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K_M.gguf) | Q3_K_M | 0.65GB |\n| [CroissantLLMChat-v0.1.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K_L.gguf) | Q3_K_L | 0.69GB |\n| [CroissantLLMChat-v0.1.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.IQ4_XS.gguf) | IQ4_XS | 0.7GB |\n| [CroissantLLMChat-v0.1.Q4_0.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_0.gguf) | Q4_0 | 0.72GB |\n| [CroissantLLMChat-v0.1.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.IQ4_NL.gguf) | IQ4_NL | 0.73GB |\n| 
[CroissantLLMChat-v0.1.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_K_S.gguf) | Q4_K_S | 0.76GB |\n| [CroissantLLMChat-v0.1.Q4_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_K.gguf) | Q4_K | 0.81GB |\n| [CroissantLLMChat-v0.1.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_K_M.gguf) | Q4_K_M | 0.81GB |\n| [CroissantLLMChat-v0.1.Q4_1.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_1.gguf) | Q4_1 | 0.8GB |\n| [CroissantLLMChat-v0.1.Q5_0.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_0.gguf) | Q5_0 | 0.87GB |\n| [CroissantLLMChat-v0.1.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_K_S.gguf) | Q5_K_S | 0.89GB |\n| [CroissantLLMChat-v0.1.Q5_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_K.gguf) | Q5_K | 0.93GB |\n| [CroissantLLMChat-v0.1.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_K_M.gguf) | Q5_K_M | 0.93GB |\n| [CroissantLLMChat-v0.1.Q5_1.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_1.gguf) | Q5_1 | 0.95GB |\n| [CroissantLLMChat-v0.1.Q6_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q6_K.gguf) | Q6_K | 1.09GB |\n| [CroissantLLMChat-v0.1.Q8_0.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q8_0.gguf) | Q8_0 | 1.33GB |\n\n\n\n\nOriginal model 
description:\n---\nlicense: mit\ndatasets:\n- croissantllm/croissant_dataset\n- croissantllm/CroissantLLM-2201-sft\n- cerebras/SlimPajama-627B\n- uonlp/CulturaX\n- pg19\n- bigcode/starcoderdata\nlanguage:\n- fr\n- en\npipeline_tag: text-generation\ntags:\n- legal\n- code\n- text-generation-inference\n- art\n---\n\n# CroissantLLMChat (190k steps + Chat)\n\nThis model is part of the CroissantLLM initiative, and corresponds to the checkpoint after 190k steps (2.99 T) tokens and a final Chat finetuning phase.\n\nhttps://arxiv.org/abs/2402.00786\n\nFor best performance, it should be used with a temperature of 0.3 or more, and with the exact template described below:\n\n```python\nchat = [\n   {\"role\": \"user\", \"content\": \"Que puis-je faire à Marseille en hiver?\"},\n]\n\nchat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)\n```\n\ncorresponding to:\n\n```python\nchat_input = \"\"\"<|im_start|>user\n{USER QUERY}<|im_end|>\n<|im_start|>assistant\\n\"\"\"\n```\n\n\n## Abstract\nWe introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.\nTo that end, we pioneer the approach of training an intrinsically bilingual model with a 1:1 English-to-French pretraining data ratio, a custom tokenizer, and bilingual finetuning datasets. We release the training dataset, notably containing a French split with manually curated, high-quality, and varied data sources.\nTo assess performance outside of English, we craft a novel benchmark, FrenchBench, consisting of an array of classification and generation tasks, covering various orthogonal aspects of model performance in the French Language. 
Additionally, rooted in transparency and to foster further Large Language Model research, we release codebases, and dozens of checkpoints across various model sizes, training data distributions, and training steps, as well as fine-tuned Chat models, and strong translation models. We evaluate our model through the FMTI framework, and validate 81% of the transparency criteria, far beyond the scores of even most open initiatives.\nThis work enriches the NLP landscape, breaking away from previous English-centric work in order to strengthen our understanding of multilinguality in language models.\n\n## Citation\n\nOur work can be cited as:\n\n```bash\n@misc{faysse2024croissantllm,\n      title={CroissantLLM: A Truly Bilingual French-English Language Model}, \n      author={Manuel Faysse and Patrick Fernandes and Nuno M. Guerreiro and António Loison and Duarte M. Alves and Caio Corro and Nicolas Boizard and João Alves and Ricardo Rei and Pedro H. Martins and Antoni Bigata Casademunt and François Yvon and André F. T. 
Martins and Gautier Viaud and Céline Hudelot and Pierre Colombo},\n      year={2024},\n      eprint={2402.00786},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n\n## Usage\n\nThis model is a Chat model, that is, it is finetuned for Chat function and works best with the provided template.\n\n\n#### With generate\n\nThis might require a stopping criteria on <|im_end|> token.\n\n```python\nimport torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\n\nmodel_name = \"croissantllm/CroissantLLMChat-v0.1\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForCausalLM.from_pretrained(model_name)\n\n\ngeneration_args = {\n    \"max_new_tokens\": 256,\n    \"do_sample\": True,\n    \"temperature\": 0.3,\n    \"top_p\": 0.90,\n    \"top_k\": 40,\n    \"repetition_penalty\": 1.05,\n    \"eos_token_id\": [tokenizer.eos_token_id, 32000],\n}\n\nchat = [\n   {\"role\": \"user\", \"content\": \"Qui est le président francais actuel ?\"},\n]\n\nchat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)\n\ninputs = tokenizer(chat_input, return_tensors=\"pt\").to(model.device)\ntokens = model.generate(**inputs, **generation_args)\n\nprint(tokenizer.decode(tokens[0]))\n# print tokens individually\nprint([(tokenizer.decode([tok]), tok) for tok in tokens[0].tolist()])\n```\n\n\n## Model limitations\n\nEvaluation results indicate the model is strong in its size category, and offers decent performances on writing-based tasks and internal knowledge, and very strong performance on translation tasks. The small size of the CroissantLLM model however hinders its capacity to perform more complex reasoning-based tasks, at least in a zero or few-shot manner in its generalist base or chat-model versions. 
This is aligned with other models of size and underlines the importance of scale for more abstract tasks.\n\n#### Knowledge Cutoff \nThe model training dataset has a data cutoff date corresponding to the November 2023 Wikipedia dump. This is the de facto knowledge cutoff date for our base model, although a lot of information dates back further. Updated versions can be trained through continued pre-training or subsequent fine-tuning.\n\n#### Multilingual performance.\nCroissantLLM is mostly a French and English model. Code performance is relatively limited, and although some amount of data from other languages is included within the SlimPajama training set, out-of-the-box performance in other languages is not to be expected, although some European languages do work quite well. \n\n#### Hallucinations.\nCroissantLLM can hallucinate and output factually incorrect data, especially regarding complex topics. This is to be expected given the small model size, and hallucination rates seem inferior to most models of the same size category although no quantitative assessments have been conducted outside of MT-Bench experiments.\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2402.00786",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 353,
  "gated": false,
  "private": false,
  "last_modified": "2024-11-04T18:18:01.000Z",
  "created_at": "2024-11-04T17:45:25.000Z",
  "pipeline_tag": "",
  "library_name": ""
}

Source payload excerpt (from Hugging Face API)

{
  "_id": "67290835374c4a21075e8390",
  "id": "RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf",
  "modelId": "RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf",
  "sha": "99835c290cb9fbe4b9c99ac7858643913232b990",
  "createdAt": "2024-11-04T17:45:25.000Z",
  "lastModified": "2024-11-04T18:18:01.000Z",
  "author": "RichardErkhov",
  "downloads": 353,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 21
}
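The fields in the payload excerpt above map directly onto the values shown earlier in the sheet (creation date, visibility, access). A minimal sketch of that mapping, assuming the timestamps always use the millisecond `...Z` form seen here; `parse_hf_timestamp` is a hypothetical helper and the payload is a trimmed copy of the excerpt:

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the payload excerpt above.
payload = json.loads("""
{
  "createdAt": "2024-11-04T17:45:25.000Z",
  "lastModified": "2024-11-04T18:18:01.000Z",
  "downloads": 353,
  "likes": 0,
  "gated": false,
  "private": false
}
""")


def parse_hf_timestamp(value: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SS.mmmZ' string into an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


created = parse_hf_timestamp(payload["createdAt"])
modified = parse_hf_timestamp(payload["lastModified"])

# Time between repository creation and the last recorded change.
elapsed = modified - created

# "Public" and "Open" in the sheet correspond to both flags being false.
is_open = not payload["gated"] and not payload["private"]

print(created.date(), elapsed, is_open)
```

Using `strptime` with an explicit format keeps the sketch independent of how a given Python version's `datetime.fromisoformat` treats the trailing `Z`.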