afrideva/aira-2-1b1-gguf is indexed on GraySoft with repository links, GGUF quant files (including Q2_K), and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. See related models in the same shard below.

Model Intelligence Sheet

afrideva/aira-2-1b1-gguf overview

Aira-2 is the second version of the Aira instruction-tuned series. Aira-2-1B1 is an instruction-tuned GPT-style model based on TinyLlama-1.1B. The model was trained on a dataset of prompts and completions generated synthetically by prompting already-tuned models (ChatGPT, Llama, Open-Assistant, etc.). A Gradio demo is available in Spaces.

transformers, gguf, alignment, instruction tuned, text generation, conversation, assistant, ggml, quantized, q2_k, q3_k_m, q4_k_m, q5_k_m, q6_k, q8_0, text-generation, en, dataset:nicholasKluge/instruct-aira-dataset, arxiv:1803.05457, arxiv:2109.07958, arxiv:2203.09509, base_model:nicholasKluge/Aira-2-1B1, base_model:quantized:nicholasKluge/Aira-2-1B1, license:apache-2.0, co2_eq_emissions, region:us

Downloads

147

Likes

0

Pipeline

text-generation

Library

transformers

Visibility

Public

Access

Open

Repository Files & Downloads

7 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
aira-2-1b1.fp16.gguf	GGUF	FP16	2.05 GiB	Download
aira-2-1b1.q2_k.gguf	GGUF	Q2_K	459.82 MiB	Download
aira-2-1b1.q3_k_m.gguf	GGUF	Q3_K_M	524.38 MiB	Download
aira-2-1b1.q4_k_m.gguf	GGUF	Q4_K_M	636.89 MiB	Download
aira-2-1b1.q5_k_m.gguf	GGUF	Q5_K_M	745.83 MiB	Download
aira-2-1b1.q6_k.gguf	GGUF	Q6_K	861.57 MiB	Download
aira-2-1b1.q8_0.gguf	GGUF	Q8_0	1.09 GiB	Download
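
The sizes in the table above are smaller than the figures in the repository README (for example 459.82 versus 482.15 MB for Q2_K). The gap is consistent with the README using decimal units (1 MB = 1000² bytes) and this table using binary units (1 MiB = 1024² bytes). The sketch below checks that reading; the README figures are copied from the metadata further down, and the helper name `decimal_to_binary` is only illustrative.

```python
# README sizes (decimal units) paired with the sizes listed in the table.
# Tuple layout: (README value, README unit, table value).
SIZES = {
    "aira-2-1b1.fp16.gguf":   (2.20,   "GB", 2.05),
    "aira-2-1b1.q2_k.gguf":   (482.15, "MB", 459.82),
    "aira-2-1b1.q3_k_m.gguf": (549.86, "MB", 524.38),
    "aira-2-1b1.q4_k_m.gguf": (667.83, "MB", 636.89),
    "aira-2-1b1.q5_k_m.gguf": (782.06, "MB", 745.83),
    "aira-2-1b1.q6_k.gguf":   (903.43, "MB", 861.57),
    "aira-2-1b1.q8_0.gguf":   (1.17,   "GB", 1.09),
}


def decimal_to_binary(value: float, unit: str) -> float:
    """Convert a decimal MB/GB figure to the matching MiB/GiB figure."""
    power = {"MB": 2, "GB": 3}[unit]
    return value * 1000**power / 1024**power


for name, (readme_value, unit, table_value) in SIZES.items():
    converted = decimal_to_binary(readme_value, unit)
    # Both inputs are already rounded to two decimals, so allow a small gap.
    status = "match" if abs(converted - table_value) < 0.01 else "differs"
    print(f"{name}: README {readme_value} {unit} ~ {converted:.2f} binary ({status})")
```

When budgeting disk space or RAM, it helps to decide up front which unit a runtime reports, since the two conventions differ by roughly 5% at the megabyte scale and 7% at the gigabyte scale.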

Model Details

Model Slug

afrideva/aira-2-1b1-gguf

Author

afrideva

Pipeline Task

text-generation

Library

transformers

Created

2023-12-02

Last Modified

2023-12-02

Gated

No

Private

No

HF SHA

cb54213d346139928a22e89c3d497405dc905a09

License

apache-2.0

Language

en

Base Model

nicholasKluge/Aira-2-1B1

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "base_model": "nicholasKluge/Aira-2-1B1",
    "co2_eq_emissions": {
      "emissions": 1.78,
      "geographical_location": "United States of America",
      "hardware_used": "NVIDIA A100-SXM4-40GB",
      "source": "CodeCarbon",
      "training_type": "fine-tuning"
    },
    "datasets": [
      "nicholasKluge/instruct-aira-dataset"
    ],
    "inference": false,
    "language": [
      "en"
    ],
    "library_name": "transformers",
    "license": "apache-2.0",
    "metrics": [
      "accuracy"
    ],
    "model_creator": "nicholasKluge",
    "model_name": "Aira-2-1B1",
    "pipeline_tag": "text-generation",
    "quantized_by": "afrideva",
    "tags": [
      "alignment",
      "instruction tuned",
      "text generation",
      "conversation",
      "assistant",
      "gguf",
      "ggml",
      "quantized",
      "q2_k",
      "q3_k_m",
      "q4_k_m",
      "q5_k_m",
      "q6_k",
      "q8_0"
    ],
    "widget": [
      {
        "example_title": "Greetings",
        "text": "<|startofinstruction|>How should I call you?<|endofinstruction|>"
      },
      {
        "example_title": "Machine Learning",
        "text": "<|startofinstruction|>Can you explain what is Machine Learning?<|endofinstruction|>"
      },
      {
        "example_title": "Ethics",
        "text": "<|startofinstruction|>Do you know anything about virtue ethics?<|endofinstruction|>"
      },
      {
        "example_title": "Advise",
        "text": "<|startofinstruction|>How can I make my girlfriend happy?<|endofinstruction|>"
      }
    ],
    "frontmatter": {
      "base_model": "nicholasKluge/Aira-2-1B1",
      "co2_eq_emissions": [],
      "datasets": [
        "nicholasKluge/instruct-aira-dataset"
      ],
      "inference": "false",
      "language": [
        "en"
      ],
      "library_name": "transformers",
      "license": "apache-2.0",
      "metrics": [
        "accuracy"
      ],
      "model_creator": "nicholasKluge",
      "model_name": "Aira-2-1B1",
      "pipeline_tag": "text-generation",
      "quantized_by": "afrideva",
      "tags": [
        "alignment",
        "instruction tuned",
        "text generation",
        "conversation",
        "assistant",
        "gguf",
        "ggml",
        "quantized",
        "q2_k",
        "q3_k_m",
        "q4_k_m",
        "q5_k_m",
        "q6_k",
        "q8_0"
      ],
      "widget": [
        "example_title: Greetings",
        "example_title: Machine Learning",
        "example_title: Ethics",
        "example_title: Advise"
      ]
    },
    "hero_image_url": "",
    "summary": "Aira-2 is the second version of the Aira instruction-tuned series. Aira-2-1B1 is an instruction-tuned GPT-style model based on TinyLlama-1.1B. The model was trained with a dataset composed of prompts and completions generated synthetically by prompting already-tuned models (ChatGPT, Llama, Open-Assistant, etc). Check our gradio-demo in Spaces.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nbase_model: nicholasKluge/Aira-2-1B1\nco2_eq_emissions:\n  emissions: 1.78\n  geographical_location: United States of America\n  hardware_used: NVIDIA A100-SXM4-40GB\n  source: CodeCarbon\n  training_type: fine-tuning\ndatasets:\n- nicholasKluge/instruct-aira-dataset\ninference: false\nlanguage:\n- en\nlibrary_name: transformers\nlicense: apache-2.0\nmetrics:\n- accuracy\nmodel_creator: nicholasKluge\nmodel_name: Aira-2-1B1\npipeline_tag: text-generation\nquantized_by: afrideva\ntags:\n- alignment\n- instruction tuned\n- text generation\n- conversation\n- assistant\n- gguf\n- ggml\n- quantized\n- q2_k\n- q3_k_m\n- q4_k_m\n- q5_k_m\n- q6_k\n- q8_0\nwidget:\n- example_title: Greetings\n  text: <|startofinstruction|>How should I call you?<|endofinstruction|>\n- example_title: Machine Learning\n  text: <|startofinstruction|>Can you explain what is Machine Learning?<|endofinstruction|>\n- example_title: Ethics\n  text: <|startofinstruction|>Do you know anything about virtue ethics?<|endofinstruction|>\n- example_title: Advise\n  text: <|startofinstruction|>How can I make my girlfriend happy?<|endofinstruction|>\n---\n# nicholasKluge/Aira-2-1B1-GGUF\n\nQuantized GGUF model files for [Aira-2-1B1](https://huggingface.co/nicholasKluge/Aira-2-1B1) from [nicholasKluge](https://huggingface.co/nicholasKluge)\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [aira-2-1b1.fp16.gguf](https://huggingface.co/afrideva/Aira-2-1B1-GGUF/resolve/main/aira-2-1b1.fp16.gguf) | fp16 | 2.20 GB  |\n| [aira-2-1b1.q2_k.gguf](https://huggingface.co/afrideva/Aira-2-1B1-GGUF/resolve/main/aira-2-1b1.q2_k.gguf) | q2_k | 482.15 MB  |\n| [aira-2-1b1.q3_k_m.gguf](https://huggingface.co/afrideva/Aira-2-1B1-GGUF/resolve/main/aira-2-1b1.q3_k_m.gguf) | q3_k_m | 549.86 MB  |\n| [aira-2-1b1.q4_k_m.gguf](https://huggingface.co/afrideva/Aira-2-1B1-GGUF/resolve/main/aira-2-1b1.q4_k_m.gguf) | q4_k_m | 667.83 MB  |\n| 
[aira-2-1b1.q5_k_m.gguf](https://huggingface.co/afrideva/Aira-2-1B1-GGUF/resolve/main/aira-2-1b1.q5_k_m.gguf) | q5_k_m | 782.06 MB  |\n| [aira-2-1b1.q6_k.gguf](https://huggingface.co/afrideva/Aira-2-1B1-GGUF/resolve/main/aira-2-1b1.q6_k.gguf) | q6_k | 903.43 MB  |\n| [aira-2-1b1.q8_0.gguf](https://huggingface.co/afrideva/Aira-2-1B1-GGUF/resolve/main/aira-2-1b1.q8_0.gguf) | q8_0 | 1.17 GB  |\n\n\n\n## Original Model Card:\n# Aira-2-1B1\n\n`Aira-2` is the second version of the Aira instruction-tuned series. `Aira-2-1B1` is an instruction-tuned GPT-style model based on [TinyLlama-1.1B](https://huggingface.co/PY007/TinyLlama-1.1B-intermediate-step-480k-1T). The model was trained with a dataset composed of prompts and completions generated synthetically by prompting already-tuned models (ChatGPT, Llama, Open-Assistant, etc).\n\nCheck our gradio-demo in [Spaces](https://huggingface.co/spaces/nicholasKluge/Aira-Demo).\n\n## Details\n\n- **Size:** 1,261,545,472 parameters\n- **Dataset:** [Instruct-Aira Dataset](https://huggingface.co/datasets/nicholasKluge/instruct-aira-dataset)\n- **Language:** English\n- **Number of Epochs:** 3\n- **Batch size:** 4\n- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e2, learning_rate = 5e-4, epsilon = 1e-8)\n- **GPU:** 1 NVIDIA A100-SXM4-40GB\n- **Emissions:** 1.78 KgCO2 (Singapore)\n- **Total Energy Consumption:** 3.64 kWh\n\nThis repository has the [source code](https://github.com/Nkluge-correa/Aira) used to train this model.\n\n## Usage\n\nThree special tokens are used to mark the user side of the interaction and the model's response:\n\n`<|startofinstruction|>`What is a language model?`<|endofinstruction|>`A language model is a probability distribution over a vocabulary.`<|endofcompletion|>`\n\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nimport torch\n\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\ntokenizer = 
AutoTokenizer.from_pretrained('nicholasKluge/Aira-2-1B1')\naira = AutoModelForCausalLM.from_pretrained('nicholasKluge/Aira-2-1B1')\n\naira.eval()\naira.to(device)\n\nquestion =  input(\"Enter your question: \")\n\ninputs = tokenizer(tokenizer.bos_token + question + tokenizer.sep_token, return_tensors=\"pt\").to(device)\n\nresponses = aira.generate(**inputs,\n\tbos_token_id=tokenizer.bos_token_id,\n\tpad_token_id=tokenizer.pad_token_id,\n\teos_token_id=tokenizer.eos_token_id,\n\tdo_sample=True,\n\ttop_k=50,\n\tmax_length=500,\n\ttop_p=0.95,\n\ttemperature=0.7,\n\tnum_return_sequences=2)\n\nprint(f\"Question: 👤 {question}\\n\")\n\nfor i, response in  enumerate(responses):\n\tprint(f'Response {i+1}: 🤖 {tokenizer.decode(response, skip_special_tokens=True).replace(question, \"\")}')\n```\n\nThe model will output something like:\n\n```markdown\n>>>Question: 👤 What is the capital of Brazil?\n\n>>>Response 1: 🤖 The capital of Brazil is Brasília.\n>>>Response 2: 🤖 The capital of Brazil is Brasília.\n```\n\n## Limitations\n\n🤥 Generative models can perpetuate the generation of pseudo-informative content, that is, false information that may appear truthful.\n\n🤬 In certain types of tasks, generative models can produce harmful and discriminatory content inspired by historical stereotypes.\n\n## Evaluation\n\n| Model (TinyLlama)                                             | Average   | [ARC](https://arxiv.org/abs/1803.05457) | [TruthfulQA](https://arxiv.org/abs/2109.07958) | [ToxiGen](https://arxiv.org/abs/2203.09509) |\n|---------------------------------------------------------------|-----------|-----------------------------------------|------------------------------------------------|---------------------------------------------|\n| [Aira-2-1B1](https://huggingface.co/nicholasKluge/Aira-2-1B1) | **42.55** | 25.26                                   | **50.81**                                      | **51.59**                                   |\n| 
TinyLlama-1.1B-intermediate-step-480k-1T                      | 37.52     | **30.89**                               | 39.55                                          | 42.13                                       |\n\n\n* Evaluations were performed using the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) (by [EleutherAI](https://www.eleuther.ai/)).\n\n## Cite as 🤗\n\n```latex\n\n@misc{nicholas22aira,\n  doi = {10.5281/zenodo.6989727},\n  url = {https://huggingface.co/nicholasKluge/Aira-2-1B1},\n  author = {Nicholas Kluge Corrêa},\n  title = {Aira},\n  year = {2023},\n  publisher = {HuggingFace},\n  journal = {HuggingFace repository},\n}\n\n```\n\n## License\n\nThe `Aira-2-1B1` is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.\n\n# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)\nDetailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_nicholasKluge__Aira-2-1B1)\n\n| Metric                | Value                     |\n|-----------------------|---------------------------|\n| Avg.                  | 25.19   |\n| ARC (25-shot)         | 23.21          |\n| HellaSwag (10-shot)   | 26.97    |\n| MMLU (5-shot)         | 24.86         |\n| TruthfulQA (0-shot)   | 50.63   |\n| Winogrande (5-shot)   | 50.28   |\n| GSM8K (5-shot)        | 0.0        |\n| DROP (3-shot)         | 0.39         |",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "alignment",
    "instruction tuned",
    "text generation",
    "conversation",
    "assistant",
    "ggml",
    "quantized",
    "q2_k",
    "q3_k_m",
    "q4_k_m",
    "q5_k_m",
    "q6_k",
    "q8_0",
    "text-generation",
    "en",
    "dataset:nicholasKluge/instruct-aira-dataset",
    "arxiv:1803.05457",
    "arxiv:2109.07958",
    "arxiv:2203.09509",
    "base_model:nicholasKluge/Aira-2-1B1",
    "base_model:quantized:nicholasKluge/Aira-2-1B1",
    "license:apache-2.0",
    "co2_eq_emissions",
    "region:us"
  ],
  "likes": 0,
  "downloads": 147,
  "gated": false,
  "private": false,
  "last_modified": "2023-12-02T01:04:47.000Z",
  "created_at": "2023-12-02T01:00:43.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "transformers"
}
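
The widget examples and the README usage notes in the metadata above describe a prompt format built from three special tokens: `<|startofinstruction|>`, `<|endofinstruction|>`, and `<|endofcompletion|>`. A minimal sketch of wrapping a question and trimming a raw generation follows. The helper names `build_prompt` and `extract_completion` are hypothetical, and whether a given GGUF runtime tokenizes these strings as single special tokens depends on the tokenizer metadata stored in the file, which this page does not confirm.

```python
# Special tokens as quoted in the model card.
START_INSTRUCTION = "<|startofinstruction|>"
END_INSTRUCTION = "<|endofinstruction|>"
END_COMPLETION = "<|endofcompletion|>"


def build_prompt(question: str) -> str:
    """Wrap a user question in the instruction markers (hypothetical helper)."""
    return f"{START_INSTRUCTION}{question.strip()}{END_INSTRUCTION}"


def extract_completion(generated: str) -> str:
    """Return the text after the instruction block, up to the completion marker.

    If a marker is missing, the corresponding split is a no-op, so plain
    text passes through unchanged apart from whitespace trimming.
    """
    tail = generated.split(END_INSTRUCTION, 1)[-1]
    return tail.split(END_COMPLETION, 1)[0].strip()


prompt = build_prompt("What is a language model?")
raw = prompt + "A language model is a probability distribution over a vocabulary." + END_COMPLETION
print(prompt)
print(extract_completion(raw))
```

In a local runtime, `<|endofcompletion|>` would typically also be passed as a stop sequence so that generation halts at the end of the answer; the exact option name varies by runtime.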

Source payload excerpt (from Hugging Face API)

{
  "_id": "656a81bbb9fa60e33d9ca186",
  "id": "afrideva/Aira-2-1B1-GGUF",
  "modelId": "afrideva/Aira-2-1B1-GGUF",
  "sha": "cb54213d346139928a22e89c3d497405dc905a09",
  "createdAt": "2023-12-02T01:00:43.000Z",
  "lastModified": "2023-12-02T01:04:47.000Z",
  "author": "afrideva",
  "downloads": 147,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "transformers",
  "siblings_count": 9
}
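
The `id` field in the payload above gives the repository name in its original capitalization (`afrideva/Aira-2-1B1-GGUF`), which is the form used by the download links in the README. The sketch below rebuilds those links from a quantization label. The function names are illustrative, and passing a commit SHA as `revision` is an assumption about how the `resolve` endpoint is commonly used rather than something this page states.

```python
REPO_ID = "afrideva/Aira-2-1B1-GGUF"

# Quantization labels that appear in the repository file list.
QUANTS = ("fp16", "q2_k", "q3_k_m", "q4_k_m", "q5_k_m", "q6_k", "q8_0")


def gguf_filename(quant: str) -> str:
    """Map a quantization label (any case) to the file name used in the repo."""
    label = quant.lower()
    if label not in QUANTS:
        raise ValueError(f"unknown quantization {quant!r}; expected one of {QUANTS}")
    return f"aira-2-1b1.{label}.gguf"


def resolve_url(quant: str, revision: str = "main") -> str:
    """Build a download URL following the pattern of the README links."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{gguf_filename(quant)}"


print(resolve_url("Q2_K"))
```

Pinning `revision` to the SHA listed under HF SHA, instead of `main`, would keep a download reproducible if the repository is updated later.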

More models in this shard