GraySoft
Model Intelligence Sheet

richarderkhov/nicholaskluge_-_aira-2-124m-gguf overview

Aira-2 is the second version of the Aira instruction-tuned series. Aira-2-124M is an instruction-tuned model based on GPT-2. It was trained on a dataset of prompts and completions generated synthetically by prompting already-tuned models (ChatGPT, Llama, Open-Assistant, etc.). A Gradio demo is available in Spaces.

Tags: gguf, arxiv:1803.05457, arxiv:2109.07958, arxiv:2203.09509, endpoints_compatible, region:us
Downloads
84
Likes
0
Pipeline
Library
Visibility
Public
Access
Open

Repository Files & Downloads

22 files detected
Direct downloads for all repository files
File Type Quantization Size Link
Aira-2-124M.IQ3_M.gguf GGUF IQ3_M 89.86 MB Download
Aira-2-124M.IQ3_S.gguf GGUF IQ3_S 85.98 MB Download
Aira-2-124M.IQ3_XS.gguf GGUF IQ3_XS 85.03 MB Download
Aira-2-124M.IQ4_NL.gguf GGUF IQ4_NL 101.90 MB Download
Aira-2-124M.IQ4_XS.gguf GGUF IQ4_XS 98.29 MB Download
Aira-2-124M.Q2_K.gguf GGUF Q2_K 77.44 MB Download
Aira-2-124M.Q3_K.gguf GGUF Q3_K 93.15 MB Download
Aira-2-124M.Q3_K_L.gguf GGUF Q3_K_L 97.37 MB Download
Aira-2-124M.Q3_K_M.gguf GGUF Q3_K_M 93.15 MB Download
Aira-2-124M.Q3_K_S.gguf GGUF Q3_K_S 85.98 MB Download
Aira-2-124M.Q4_0.gguf GGUF Q4_0 101.62 MB Download
Aira-2-124M.Q4_1.gguf GGUF Q4_1 108.99 MB Download
Aira-2-124M.Q4_K.gguf GGUF Q4_K 107.63 MB Download
Aira-2-124M.Q4_K_M.gguf GGUF Q4_K_M 107.63 MB Download
Aira-2-124M.Q4_K_S.gguf GGUF Q4_K_S 101.90 MB Download
Aira-2-124M.Q5_0.gguf GGUF Q5_0 116.35 MB Download
Aira-2-124M.Q5_1.gguf GGUF Q5_1 123.71 MB Download
Aira-2-124M.Q5_K.gguf GGUF Q5_K 120.83 MB Download
Aira-2-124M.Q5_K_M.gguf GGUF Q5_K_M 120.83 MB Download
Aira-2-124M.Q5_K_S.gguf GGUF Q5_K_S 116.35 MB Download
Aira-2-124M.Q6_K.gguf GGUF Q6_K 132.00 MB Download
Aira-2-124M.Q8_0.gguf GGUF Q8_0 169.44 MB Download
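
The file names above follow one pattern, `Aira-2-124M.<QUANT>.gguf`, so a direct-download URL can be derived from the quantization label alone. The sketch below is a minimal illustration, not the site's own download logic: the helper name `gguf_download_url` is hypothetical, and it assumes the files are served through the Hugging Face Hub's `resolve/<revision>/<filename>` path on the `main` branch.

```python
from urllib.parse import quote

# Repository id as it appears in the Hugging Face API payload for this model.
REPO_ID = "RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf"


def gguf_download_url(quant: str, repo_id: str = REPO_ID, revision: str = "main") -> str:
    """Build a direct-download URL for one quantization (hypothetical helper).

    Assumes the Hub's `resolve/<revision>/<filename>` convention and the
    `Aira-2-124M.<QUANT>.gguf` naming used in the file table.
    """
    filename = f"Aira-2-124M.{quant}.gguf"
    # Percent-encode the revision and file name; the repo id keeps its slash.
    return (
        f"https://huggingface.co/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


if __name__ == "__main__":
    # Print the URL for the Q4_K_M file listed in the table.
    print(gguf_download_url("Q4_K_M"))
```

The function only builds a string and performs no network access, so the result can be passed to whatever download tool is already in use.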

Model Details

Model Slug
richarderkhov/nicholaskluge_-_aira-2-124m-gguf
Author
RichardErkhov
Pipeline Task
Library
Created
2024-07-20
Last Modified
2024-07-20
Gated
No
Private
No
HF SHA
4ecf8b7c8a6cccfdb99b8b68b5d8ec9d76fa25fe
License
Unknown
Language
Unknown
Base Model
Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "Aira-2 is the second version of the Aira instruction-tuned series. Aira-2-124M is an instruction-tuned model based on GPT-2. The model was trained with a dataset composed of prompts and completions generated synthetically by prompting already-tuned models (ChatGPT, Llama, Open-Assistant, etc). Check our gradio-demo in Spaces.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nAira-2-124M - GGUF\n- Model creator: https://huggingface.co/nicholasKluge/\n- Original model: https://huggingface.co/nicholasKluge/Aira-2-124M/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [Aira-2-124M.Q2_K.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q2_K.gguf) | Q2_K | 0.08GB |\n| [Aira-2-124M.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.IQ3_XS.gguf) | IQ3_XS | 0.08GB |\n| [Aira-2-124M.IQ3_S.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.IQ3_S.gguf) | IQ3_S | 0.08GB |\n| [Aira-2-124M.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q3_K_S.gguf) | Q3_K_S | 0.08GB |\n| [Aira-2-124M.IQ3_M.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.IQ3_M.gguf) | IQ3_M | 0.09GB |\n| [Aira-2-124M.Q3_K.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q3_K.gguf) | Q3_K | 0.09GB |\n| [Aira-2-124M.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q3_K_M.gguf) | Q3_K_M | 0.09GB |\n| [Aira-2-124M.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q3_K_L.gguf) | Q3_K_L | 0.1GB |\n| [Aira-2-124M.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.IQ4_XS.gguf) | IQ4_XS | 0.1GB |\n| [Aira-2-124M.Q4_0.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q4_0.gguf) | Q4_0 | 0.1GB |\n| 
[Aira-2-124M.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.IQ4_NL.gguf) | IQ4_NL | 0.1GB |\n| [Aira-2-124M.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q4_K_S.gguf) | Q4_K_S | 0.1GB |\n| [Aira-2-124M.Q4_K.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q4_K.gguf) | Q4_K | 0.11GB |\n| [Aira-2-124M.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q4_K_M.gguf) | Q4_K_M | 0.11GB |\n| [Aira-2-124M.Q4_1.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q4_1.gguf) | Q4_1 | 0.11GB |\n| [Aira-2-124M.Q5_0.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q5_0.gguf) | Q5_0 | 0.11GB |\n| [Aira-2-124M.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q5_K_S.gguf) | Q5_K_S | 0.11GB |\n| [Aira-2-124M.Q5_K.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q5_K.gguf) | Q5_K | 0.12GB |\n| [Aira-2-124M.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q5_K_M.gguf) | Q5_K_M | 0.12GB |\n| [Aira-2-124M.Q5_1.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q5_1.gguf) | Q5_1 | 0.12GB |\n| [Aira-2-124M.Q6_K.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q6_K.gguf) | Q6_K | 0.13GB |\n| [Aira-2-124M.Q8_0.gguf](https://huggingface.co/RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf/blob/main/Aira-2-124M.Q8_0.gguf) | Q8_0 | 0.17GB |\n\n\n\n\nOriginal model description:\n---\nlicense: apache-2.0\ndatasets:\n- nicholasKluge/instruct-aira-dataset\nlanguage:\n- en\nmetrics:\n- accuracy\nlibrary_name: transformers\ntags:\n- 
alignment\n- instruction tuned\n- text generation\n- conversation\n- assistant\npipeline_tag: text-generation\nwidget:\n- text: \"<|startofinstruction|>Can you explain what is Machine Learning?<|endofinstruction|>\"\n  example_title: Machine Learning\n- text: \"<|startofinstruction|>Do you know anything about virtue ethics?<|endofinstruction|>\"\n  example_title: Ethics\n- text: \"<|startofinstruction|>How can I make my girlfriend happy?<|endofinstruction|>\"\n  example_title: Advise\ninference:\n  parameters:\n    repetition_penalty: 1.2\n    temperature: 0.2\n    top_k: 30\n    top_p: 0.3\n    max_new_tokens: 200\n    length_penalty: 0.3\n    early_stopping: true\nco2_eq_emissions:\n  emissions: 250\n  source: CodeCarbon\n  training_type: fine-tuning\n  geographical_location: United States of America\n  hardware_used: NVIDIA A100-SXM4-40GB\n---\n# Aira-2-124M\n\nAira-2 is the second version of the Aira instruction-tuned series. Aira-2-124M is an instruction-tuned model based on [GPT-2](https://huggingface.co/gpt2). 
The model was trained with a dataset composed of prompts and completions generated synthetically by prompting already-tuned models (ChatGPT, Llama, Open-Assistant, etc).\n\nCheck our gradio-demo in [Spaces](https://huggingface.co/spaces/nicholasKluge/Aira-Demo).\n\n## Details\n\n- **Size:** 124,441,344 parameters\n- **Dataset:** [Instruct-Aira Dataset](https://huggingface.co/datasets/nicholasKluge/instruct-aira-dataset)\n- **Language:** English\n- **Number of Epochs:** 5\n- **Batch size:** 32\n- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e2, learning_rate = 5e-4, epsilon = 1e-8)\n- **GPU:** 1 NVIDIA A100-SXM4-40GB\n- **Emissions:** 0.25 KgCO2 (Singapore)\n- **Total Energy Consumption:** 0.52 kWh\n\nThis repository has the [source code](https://github.com/Nkluge-correa/Aira) used to train this model.\n\n## Usage\n\nThree special tokens are used to mark the user side of the interaction and the model's response:\n\n`<|startofinstruction|>`What is a language model?`<|endofinstruction|>`A language model is a probability distribution over a vocabulary.`<|endofcompletion|>`\n\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nimport torch\n\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\ntokenizer = AutoTokenizer.from_pretrained('nicholasKluge/Aira-2-124M')\naira = AutoModelForCausalLM.from_pretrained('nicholasKluge/Aira-2-124M')\n\naira.eval()\naira.to(device)\n\nquestion =  input(\"Enter your question: \")\n\ninputs = tokenizer(tokenizer.bos_token + question + tokenizer.sep_token,\n  add_special_tokens=False,\n  return_tensors=\"pt\").to(device)\n\nresponses = aira.generate(**inputs,\tnum_return_sequences=2)\n\nprint(f\"Question: 👤 {question}\\n\")\n\nfor i, response in  enumerate(responses):\n\tprint(f'Response {i+1}: 🤖 {tokenizer.decode(response, skip_special_tokens=True).replace(question, \"\")}')\n```\n\nThe model will output something like:\n\n```markdown\n>>>Question: 👤 What is the capital of 
Brazil?\n\n>>>Response 1: 🤖 The capital of Brazil is Brasília.\n>>>Response 2: 🤖 The capital of Brazil is Brasília.\n```\n\n## Limitations\n\n- **Hallucinations:** This model can produce content that can be mistaken for truth but is, in fact, misleading or entirely false, i.e., hallucination.\n\n- **Biases and Toxicity:** This model inherits the social and historical stereotypes from the data used to train it. Given these biases, the model can produce toxic content, i.e., harmful, offensive, or detrimental to individuals, groups, or communities.\n\n- **Repetition and Verbosity:** The model may get stuck on repetition loops (especially if the repetition penalty during generations is set to a meager value) or produce verbose responses unrelated to the prompt it was given.\n\n## Evaluation\n\n|Model                                                                   |Average   |[ARC](https://arxiv.org/abs/1803.05457) |[TruthfulQA](https://arxiv.org/abs/2109.07958) |[ToxiGen](https://arxiv.org/abs/2203.09509) |\n| ---------------------------------------------------------------------- | -------- | -------------------------------------- | --------------------------------------------- | ------------------------------------------ | \n|[Aira-2-124M-DPO](https://huggingface.co/nicholasKluge/Aira-2-124M-DPO) |**40.68** |**24.66**                               |**42.61**                                      |**54.79**                                   |\n|[Aira-2-124M](https://huggingface.co/nicholasKluge/Aira-2-124M)         |38.07     |24.57                                   |41.02                                          |48.62                                       |\n|GPT-2                                                                   |35.37     |21.84                                   |40.67                                          |43.62                                       |\n|[Aira-2-355M](https://huggingface.co/nicholasKluge/Aira-2-355M)         |**39.68** 
|**27.56**                               |38.53                                          |**53.19**                                   |\n|GPT-2-medium                                                            |36.43     |27.05                                   |**40.76**                                      |41.49                                       |\n|[Aira-2-774M](https://huggingface.co/nicholasKluge/Aira-2-774M)         |**42.26** |**28.75**                               |**41.33**                                      |**56.70**                                   |\n|GPT-2-large                                                             |35.16     |25.94                                   |38.71                                          |40.85                                       |\n|[Aira-2-1B5](https://huggingface.co/nicholasKluge/Aira-2-1B5)           |**42.22** |28.92                                   |**41.16**                                      |**56.60**                                   |\n|GPT-2-xl                                                                |36.84     |**30.29**                               |38.54                                          |41.70                                       |\n\n* Evaluations were performed using the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) (by [EleutherAI](https://www.eleuther.ai/)).\n\n## Cite as 🤗\n\n```latex\n@misc{nicholas22aira,\n  doi = {10.5281/zenodo.6989727},\n  url = {https://github.com/Nkluge-correa/Aira},\n  author = {Nicholas Kluge Corrêa},\n  title = {Aira},\n  year = {2023},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n}\n\n@phdthesis{kluge2024dynamic,\n  title={Dynamic Normativity},\n  author={Kluge Corr{\\^e}a, Nicholas},\n  year={2024},\n  school={Universit{\\\"a}ts-und Landesbibliothek Bonn}\n}\n```\n\n## License\n\nAira-2-124M is licensed under the Apache License, Version 2.0. 
See the [LICENSE](LICENSE) file for more details.\n\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:1803.05457",
    "arxiv:2109.07958",
    "arxiv:2203.09509",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 0,
  "downloads": 84,
  "gated": false,
  "private": false,
  "last_modified": "2024-07-20T09:30:39.000Z",
  "created_at": "2024-07-20T09:19:55.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
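
The README embedded in the metadata above describes a prompt format built from three special tokens, and its front matter lists sampling parameters for the hosted widget. The sketch below shows one way to assemble such a prompt and trim a raw completion when running a GGUF file in a runtime that does not apply the template itself. The helper names `build_prompt` and `extract_completion` are hypothetical, and reusing the widget parameters as defaults for a quantized file is an assumption rather than a recommendation from the model author.

```python
# Special tokens quoted from the original model card.
START = "<|startofinstruction|>"
END = "<|endofinstruction|>"
END_OF_COMPLETION = "<|endofcompletion|>"

# Sampling values copied from the `inference.parameters` block of the card's
# front matter; whether they suit a given quantization is untested here.
GENERATION_PARAMS = {
    "repetition_penalty": 1.2,
    "temperature": 0.2,
    "top_k": 30,
    "top_p": 0.3,
    "max_new_tokens": 200,
}


def build_prompt(question: str) -> str:
    """Wrap a user question in the instruction tokens (hypothetical helper)."""
    return f"{START}{question.strip()}{END}"


def extract_completion(generated: str) -> str:
    """Return the text between the instruction end and the completion end.

    If either marker is missing, the corresponding side is left untrimmed.
    """
    tail = generated.split(END, 1)[-1]
    return tail.split(END_OF_COMPLETION, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt("What is a language model?")
    # Simulated model output, reusing the example sentence from the card.
    raw = prompt + "A language model is a probability distribution over a vocabulary." + END_OF_COMPLETION
    print(extract_completion(raw))
```

Stopping generation on `<|endofcompletion|>` where the runtime supports stop strings avoids having to trim trailing text afterwards.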
Source payload excerpt (from Hugging Face API)
{
  "_id": "669b813b990749deca0fc0ab",
  "id": "RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf",
  "modelId": "RichardErkhov/nicholasKluge_-_Aira-2-124M-gguf",
  "sha": "4ecf8b7c8a6cccfdb99b8b68b5d8ec9d76fa25fe",
  "createdAt": "2024-07-20T09:19:55.000Z",
  "lastModified": "2024-07-20T09:30:39.000Z",
  "author": "RichardErkhov",
  "downloads": 84,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}