GraySoft
Model Intelligence Sheet

richarderkhov/croissantllm_-_croissantllmchat-v0.1-gguf overview

This model is part of the CroissantLLM initiative and corresponds to the checkpoint after 190k steps (2.99T tokens) followed by a final Chat finetuning phase. https://arxiv.org/abs/2402.00786 For best performance, it should be used with a temperature of 0.3 or more and with the exact chat template given in the original model card (reproduced in the README under Metadata Inspector).
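
The template itself did not survive on this page, but the README stored under Metadata Inspector gives it as `<|im_start|>user`, the query, `<|im_end|>`, then an opening `<|im_start|>assistant` turn. A minimal sketch of building that prompt by hand follows, for runtimes that do not apply the tokenizer's chat template themselves. The helper names `build_prompt` and `trim_reply` are hypothetical, and extending the single-turn template to several messages is an assumption.

```python
# Special tokens as quoted in the README's template.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_prompt(messages):
    """Render a list of {"role", "content"} dicts into the chat template.

    The README only shows a single user turn; repeating the same pattern
    for every message is an assumption.
    """
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    # Open an assistant turn so the model continues from there.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


def trim_reply(generated_text):
    """Keep only the text before the first end-of-turn marker."""
    return generated_text.split(IM_END, 1)[0].strip()


prompt = build_prompt(
    [{"role": "user", "content": "Que puis-je faire à Marseille en hiver?"}]
)
print(prompt)
```

`trim_reply` covers the README's note that generation may need to stop on `<|im_end|>`: if the runtime does not stop there on its own, the reply can be cut at that marker afterwards.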

Tags: gguf, arxiv:2402.00786, endpoints_compatible, region:us, conversational
Downloads: 353
Likes: 0
Pipeline: (not set)
Library: (not set)
Visibility: Public
Access: Open

Repository Files & Downloads

19 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
| ---- | ---- | ---- | ---- | ---- |
| CroissantLLMChat-v0.1.IQ4_NL.gguf | GGUF | IQ4_NL | 744.96 MB | Download |
| CroissantLLMChat-v0.1.IQ4_XS.gguf | GGUF | IQ4_XS | 714.88 MB | Download |
| CroissantLLMChat-v0.1.Q2_K.gguf | GGUF | Q2_K | 532.82 MB | Download |
| CroissantLLMChat-v0.1.Q3_K.gguf | GGUF | Q3_K | 670.50 MB | Download |
| CroissantLLMChat-v0.1.Q3_K_L.gguf | GGUF | Q3_K_L | 708.95 MB | Download |
| CroissantLLMChat-v0.1.Q3_K_M.gguf | GGUF | Q3_K_M | 670.50 MB | Download |
| CroissantLLMChat-v0.1.Q3_K_S.gguf | GGUF | Q3_K_S | 611.08 MB | Download |
| CroissantLLMChat-v0.1.Q4_0.gguf | GGUF | Q4_0 | 738.91 MB | Download |
| CroissantLLMChat-v0.1.Q4_1.gguf | GGUF | Q4_1 | 815.19 MB | Download |
| CroissantLLMChat-v0.1.Q4_K.gguf | GGUF | Q4_K | 831.91 MB | Download |
| CroissantLLMChat-v0.1.Q4_K_M.gguf | GGUF | Q4_K_M | 831.91 MB | Download |
| CroissantLLMChat-v0.1.Q4_K_S.gguf | GGUF | Q4_K_S | 775.17 MB | Download |
| CroissantLLMChat-v0.1.Q5_0.gguf | GGUF | Q5_0 | 891.47 MB | Download |
| CroissantLLMChat-v0.1.Q5_1.gguf | GGUF | Q5_1 | 967.75 MB | Download |
| CroissantLLMChat-v0.1.Q5_K.gguf | GGUF | Q5_K | 954.28 MB | Download |
| CroissantLLMChat-v0.1.Q5_K_M.gguf | GGUF | Q5_K_M | 954.28 MB | Download |
| CroissantLLMChat-v0.1.Q5_K_S.gguf | GGUF | Q5_K_S | 907.60 MB | Download |
| CroissantLLMChat-v0.1.Q6_K.gguf | GGUF | Q6_K | 1.09 GB | Download |
| CroissantLLMChat-v0.1.Q8_0.gguf | GGUF | Q8_0 | 1.33 GB | Download |
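
Choosing among these files is mostly a size trade-off, so a small helper can pick the largest quantization that fits a given budget. The sketch below uses a subset of the sizes listed above. The function names `pick_quant` and `download_url` are hypothetical, and the `resolve/main` path is assumed from the Hub's usual direct-download URL layout (the README links point at `blob/main` on the same branch). File size is only a rough proxy for runtime memory, which also depends on context length.

```python
REPO_ID = "RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf"

# Sizes in MB for a subset of the files in the table above.
SIZES_MB = {
    "Q2_K": 532.82,
    "Q3_K_M": 670.50,
    "Q4_K_M": 831.91,
    "Q5_K_M": 954.28,
}


def pick_quant(budget_mb, sizes=SIZES_MB):
    """Return the largest quantization whose file fits in budget_mb, or None."""
    fitting = {quant: size for quant, size in sizes.items() if size <= budget_mb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download_url(quant, repo_id=REPO_ID):
    """Build a direct-download URL (assumed Hub layout: /resolve/main/<file>)."""
    filename = f"CroissantLLMChat-v0.1.{quant}.gguf"
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


choice = pick_quant(900)
print(choice, download_url(choice))
```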

Model Details Live

Model Slug: richarderkhov/croissantllm_-_croissantllmchat-v0.1-gguf
Author: RichardErkhov
Pipeline Task: (not set)
Library: (not set)
Created: 2024-11-04
Last Modified: 2024-11-04
Gated: No
Private: No
HF SHA: 99835c290cb9fbe4b9c99ac7858643913232b990
License: Unknown
Language: Unknown
Base Model: Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "This model is part of the CroissantLLM initiative, and corresponds to the checkpoint after 190k steps (2.99 T) tokens and a final Chat finetuning phase. https://arxiv.org/abs/2402.00786 For best performance, it should be used with a temperature of 0.3 or more, and with the exact template described below: ``python chat = [ {\"role\": \"user\", \"content\": \"Que puis-je faire à Marseille en hiver?\"}, ] chat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True) ` corresponding to: `python chat_input = \"\"\"user {USER QUERY} assistant\\n\"\"\" ``",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nCroissantLLMChat-v0.1 - GGUF\n- Model creator: https://huggingface.co/croissantllm/\n- Original model: https://huggingface.co/croissantllm/CroissantLLMChat-v0.1/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [CroissantLLMChat-v0.1.Q2_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q2_K.gguf) | Q2_K | 0.52GB |\n| [CroissantLLMChat-v0.1.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K_S.gguf) | Q3_K_S | 0.6GB |\n| [CroissantLLMChat-v0.1.Q3_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K.gguf) | Q3_K | 0.65GB |\n| [CroissantLLMChat-v0.1.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K_M.gguf) | Q3_K_M | 0.65GB |\n| [CroissantLLMChat-v0.1.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q3_K_L.gguf) | Q3_K_L | 0.69GB |\n| [CroissantLLMChat-v0.1.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.IQ4_XS.gguf) | IQ4_XS | 0.7GB |\n| [CroissantLLMChat-v0.1.Q4_0.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_0.gguf) | Q4_0 | 0.72GB |\n| [CroissantLLMChat-v0.1.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.IQ4_NL.gguf) | IQ4_NL | 0.73GB |\n| 
[CroissantLLMChat-v0.1.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_K_S.gguf) | Q4_K_S | 0.76GB |\n| [CroissantLLMChat-v0.1.Q4_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_K.gguf) | Q4_K | 0.81GB |\n| [CroissantLLMChat-v0.1.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_K_M.gguf) | Q4_K_M | 0.81GB |\n| [CroissantLLMChat-v0.1.Q4_1.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q4_1.gguf) | Q4_1 | 0.8GB |\n| [CroissantLLMChat-v0.1.Q5_0.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_0.gguf) | Q5_0 | 0.87GB |\n| [CroissantLLMChat-v0.1.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_K_S.gguf) | Q5_K_S | 0.89GB |\n| [CroissantLLMChat-v0.1.Q5_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_K.gguf) | Q5_K | 0.93GB |\n| [CroissantLLMChat-v0.1.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_K_M.gguf) | Q5_K_M | 0.93GB |\n| [CroissantLLMChat-v0.1.Q5_1.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q5_1.gguf) | Q5_1 | 0.95GB |\n| [CroissantLLMChat-v0.1.Q6_K.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q6_K.gguf) | Q6_K | 1.09GB |\n| [CroissantLLMChat-v0.1.Q8_0.gguf](https://huggingface.co/RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf/blob/main/CroissantLLMChat-v0.1.Q8_0.gguf) | Q8_0 | 1.33GB |\n\n\n\n\nOriginal model 
description:\n---\nlicense: mit\ndatasets:\n- croissantllm/croissant_dataset\n- croissantllm/CroissantLLM-2201-sft\n- cerebras/SlimPajama-627B\n- uonlp/CulturaX\n- pg19\n- bigcode/starcoderdata\nlanguage:\n- fr\n- en\npipeline_tag: text-generation\ntags:\n- legal\n- code\n- text-generation-inference\n- art\n---\n\n# CroissantLLMChat (190k steps + Chat)\n\nThis model is part of the CroissantLLM initiative, and corresponds to the checkpoint after 190k steps (2.99 T) tokens and a final Chat finetuning phase.\n\nhttps://arxiv.org/abs/2402.00786\n\nFor best performance, it should be used with a temperature of 0.3 or more, and with the exact template described below:\n\n```python\nchat = [\n   {\"role\": \"user\", \"content\": \"Que puis-je faire à Marseille en hiver?\"},\n]\n\nchat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)\n```\n\ncorresponding to:\n\n```python\nchat_input = \"\"\"<|im_start|>user\n{USER QUERY}<|im_end|>\n<|im_start|>assistant\\n\"\"\"\n```\n\n\n## Abstract\nWe introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.\nTo that end, we pioneer the approach of training an intrinsically bilingual model with a 1:1 English-to-French pretraining data ratio, a custom tokenizer, and bilingual finetuning datasets. We release the training dataset, notably containing a French split with manually curated, high-quality, and varied data sources.\nTo assess performance outside of English, we craft a novel benchmark, FrenchBench, consisting of an array of classification and generation tasks, covering various orthogonal aspects of model performance in the French Language. 
Additionally, rooted in transparency and to foster further Large Language Model research, we release codebases, and dozens of checkpoints across various model sizes, training data distributions, and training steps, as well as fine-tuned Chat models, and strong translation models. We evaluate our model through the FMTI framework, and validate 81% of the transparency criteria, far beyond the scores of even most open initiatives.\nThis work enriches the NLP landscape, breaking away from previous English-centric work in order to strengthen our understanding of multilinguality in language models.\n\n## Citation\n\nOur work can be cited as:\n\n```bash\n@misc{faysse2024croissantllm,\n      title={CroissantLLM: A Truly Bilingual French-English Language Model}, \n      author={Manuel Faysse and Patrick Fernandes and Nuno M. Guerreiro and António Loison and Duarte M. Alves and Caio Corro and Nicolas Boizard and João Alves and Ricardo Rei and Pedro H. Martins and Antoni Bigata Casademunt and François Yvon and André F. T. 
Martins and Gautier Viaud and Céline Hudelot and Pierre Colombo},\n      year={2024},\n      eprint={2402.00786},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n\n## Usage\n\nThis model is a Chat model, that is, it is finetuned for chat use and works best with the provided template.\n\n\n#### With generate\n\nThis might require a stopping criterion on the <|im_end|> token.\n\n```python\nimport torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\n\nmodel_name = \"croissantllm/CroissantLLMChat-v0.1\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForCausalLM.from_pretrained(model_name)\n\n\ngeneration_args = {\n    \"max_new_tokens\": 256,\n    \"do_sample\": True,\n    \"temperature\": 0.3,\n    \"top_p\": 0.90,\n    \"top_k\": 40,\n    \"repetition_penalty\": 1.05,\n    \"eos_token_id\": [tokenizer.eos_token_id, 32000],\n}\n\nchat = [\n   {\"role\": \"user\", \"content\": \"Qui est le président francais actuel ?\"},\n]\n\nchat_input = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)\n\ninputs = tokenizer(chat_input, return_tensors=\"pt\").to(model.device)\ntokens = model.generate(**inputs, **generation_args)\n\nprint(tokenizer.decode(tokens[0]))\n# print tokens individually\nprint([(tokenizer.decode([tok]), tok) for tok in tokens[0].tolist()])\n```\n\n\n## Model limitations\n\nEvaluation results indicate the model is strong in its size category, and offers decent performance on writing-based tasks and internal knowledge, and very strong performance on translation tasks. The small size of the CroissantLLM model, however, hinders its capacity to perform more complex reasoning-based tasks, at least in a zero- or few-shot manner in its generalist base or chat-model versions. 
This is in line with other models of similar size and underlines the importance of scale for more abstract tasks.\n\n#### Knowledge Cutoff\nThe model training dataset has a data cutoff date corresponding to the November 2023 Wikipedia dump. This is the de facto knowledge cutoff date for our base model, although a lot of information dates back further. Updated versions can be trained through continued pre-training or subsequent fine-tuning.\n\n#### Multilingual performance\nCroissantLLM is mostly a French and English model. Code performance is relatively limited. Some data from other languages is included within the SlimPajama training set, but out-of-the-box performance in other languages is not to be expected, even though some European languages do work quite well.\n\n#### Hallucinations\nCroissantLLM can hallucinate and output factually incorrect data, especially regarding complex topics. This is to be expected given the small model size. Hallucination rates seem lower than those of most models in the same size category, although no quantitative assessments have been conducted outside of MT-Bench experiments.\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2402.00786",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 353,
  "gated": false,
  "private": false,
  "last_modified": "2024-11-04T18:18:01.000Z",
  "created_at": "2024-11-04T17:45:25.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
Source payload excerpt (from Hugging Face API)
{
  "_id": "67290835374c4a21075e8390",
  "id": "RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf",
  "modelId": "RichardErkhov/croissantllm_-_CroissantLLMChat-v0.1-gguf",
  "sha": "99835c290cb9fbe4b9c99ac7858643913232b990",
  "createdAt": "2024-11-04T17:45:25.000Z",
  "lastModified": "2024-11-04T18:18:01.000Z",
  "author": "RichardErkhov",
  "downloads": 353,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 21
}
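
The two dumps above carry the same values under different key names: `createdAt` and `lastModified` in the source payload appear as `created_at` and `last_modified` in the normalized metadata, while fields such as `likes` and `downloads` keep their names. A minimal sketch of that renaming follows. The function `normalize_payload` and its key map are hypothetical illustrations, since the site's actual normalization code is not shown here.

```python
import json

# camelCase keys in the source payload and their normalized names.
KEY_MAP = {
    "createdAt": "created_at",
    "lastModified": "last_modified",
}
# Keys that appear unchanged in both dumps.
PASS_THROUGH = ("likes", "downloads", "gated", "private", "pipeline_tag", "library_name")


def normalize_payload(payload):
    """Rename mapped keys and copy pass-through keys; ignore everything else."""
    normalized = {new: payload[old] for old, new in KEY_MAP.items() if old in payload}
    normalized.update({key: payload[key] for key in PASS_THROUGH if key in payload})
    return normalized


payload = {
    "createdAt": "2024-11-04T17:45:25.000Z",
    "lastModified": "2024-11-04T18:18:01.000Z",
    "downloads": 353,
    "likes": 0,
    "gated": False,
    "private": False,
    "pipeline_tag": "",
    "library_name": "",
    "siblings_count": 21,
}
print(json.dumps(normalize_payload(payload), indent=2))
```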