GraySoft
Model Intelligence Sheet

richarderkhov/ibm_-_powermoe-3b-gguf overview

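Usage example from the original ibm/PowerMoE-3b model card. It loads the original model with transformers, not the GGUF files listed below; the card notes that this requires installing HF transformers from source.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda"  # or "cpu"
    model_path = "ibm/PowerMoE-3b"
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    # drop device_map if running on CPU
    model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
    model.eval()
    # change input text as desired
    prompt = "Write a code to find the maximum value in a list of numbers."
    # tokenize the text
    input_tokens = tokenizer(prompt, return_tensors="pt")
    # transfer tokenized inputs to the device
    for i in input_tokens:
        input_tokens[i] = input_tokens[i].to(device)
    # generate output tokens (the tokenizer output is unpacked into keyword arguments)
    output = model.generate(**input_tokens, max_new_tokens=100)
    # decode output tokens into text
    output = tokenizer.batch_decode(output)
    # loop over the batch to print; in this example the batch size is 1
    for i in output:
        print(i)

Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more quants, at much higher speed, than I would otherwise be able to.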

Tags: gguf, arxiv:2408.13359, endpoints_compatible, region:us
Downloads: 95
Likes: 0
Pipeline: (not set)
Library: (not set)
Visibility: Public
Access: Open

Repository Files & Downloads

22 files detected
Direct downloads for all repository files
File                      Type  Quantization  Size     Link
PowerMoE-3b.IQ3_M.gguf    GGUF  IQ3_M         1.41 GB  Download
PowerMoE-3b.IQ3_S.gguf    GGUF  IQ3_S         1.39 GB  Download
PowerMoE-3b.IQ3_XS.gguf   GGUF  IQ3_XS        1.32 GB  Download
PowerMoE-3b.IQ4_NL.gguf   GGUF  IQ4_NL        1.81 GB  Download
PowerMoE-3b.IQ4_XS.gguf   GGUF  IQ4_XS        1.72 GB  Download
PowerMoE-3b.Q2_K.gguf     GGUF  Q2_K          1.18 GB  Download
PowerMoE-3b.Q3_K.gguf     GGUF  Q3_K          1.53 GB  Download
PowerMoE-3b.Q3_K_L.gguf   GGUF  Q3_K_L        1.65 GB  Download
PowerMoE-3b.Q3_K_M.gguf   GGUF  Q3_K_M        1.53 GB  Download
PowerMoE-3b.Q3_K_S.gguf   GGUF  Q3_K_S        1.39 GB  Download
PowerMoE-3b.Q4_0.gguf     GGUF  Q4_0          1.79 GB  Download
PowerMoE-3b.Q4_1.gguf     GGUF  Q4_1          1.99 GB  Download
PowerMoE-3b.Q4_K.gguf     GGUF  Q4_K          1.92 GB  Download
PowerMoE-3b.Q4_K_M.gguf   GGUF  Q4_K_M        1.92 GB  Download
PowerMoE-3b.Q4_K_S.gguf   GGUF  Q4_K_S        1.81 GB  Download
PowerMoE-3b.Q5_0.gguf     GGUF  Q5_0          2.18 GB  Download
PowerMoE-3b.Q5_1.gguf     GGUF  Q5_1          2.37 GB  Download
PowerMoE-3b.Q5_K.gguf     GGUF  Q5_K          2.24 GB  Download
PowerMoE-3b.Q5_K_M.gguf   GGUF  Q5_K_M        2.24 GB  Download
PowerMoE-3b.Q5_K_S.gguf   GGUF  Q5_K_S        2.18 GB  Download
PowerMoE-3b.Q6_K.gguf     GGUF  Q6_K          2.59 GB  Download
PowerMoE-3b.Q8_0.gguf     GGUF  Q8_0          3.35 GB  Download
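
As a rough illustration of how the file list might be used, the sketch below picks the largest quantization that fits a given size budget and builds a direct-download URL for it. The helper names `pick_quant` and `download_url` are hypothetical, the sizes are copied from the listing, and the `resolve/main` URL form is an assumption based on how Hugging Face generally serves raw files (the model card itself links to `blob/main` pages). File size is only a proxy; actual memory use at inference time will be higher.

```python
# Sizes in GB, copied from the repository file listing.
REPO_ID = "RichardErkhov/ibm_-_PowerMoE-3b-gguf"

SIZES_GB = {
    "Q2_K": 1.18, "IQ3_XS": 1.32, "IQ3_S": 1.39, "Q3_K_S": 1.39,
    "IQ3_M": 1.41, "Q3_K": 1.53, "Q3_K_M": 1.53, "Q3_K_L": 1.65,
    "IQ4_XS": 1.72, "Q4_0": 1.79, "IQ4_NL": 1.81, "Q4_K_S": 1.81,
    "Q4_K": 1.92, "Q4_K_M": 1.92, "Q4_1": 1.99, "Q5_0": 2.18,
    "Q5_K_S": 2.18, "Q5_K": 2.24, "Q5_K_M": 2.24, "Q5_1": 2.37,
    "Q6_K": 2.59, "Q8_0": 3.35,
}


def pick_quant(budget_gb):
    """Return the largest quantization whose file fits in budget_gb, or None.

    Hypothetical helper: ties on size resolve to the first entry in SIZES_GB.
    """
    fitting = [(name, size) for name, size in SIZES_GB.items() if size <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda item: item[1])[0]


def download_url(quant):
    """Build a direct-download URL (assumed resolve/main form) for a quantization."""
    if quant not in SIZES_GB:
        raise KeyError(f"unknown quantization: {quant}")
    return f"https://huggingface.co/{REPO_ID}/resolve/main/PowerMoE-3b.{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(2.0)
    print(choice, download_url(choice))
```

With a 2.0 GB budget this selects the 1.99 GB Q4_1 file; a budget below 1.18 GB returns `None` because no file in the listing is that small.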

Model Details

Model Slug: richarderkhov/ibm_-_powermoe-3b-gguf
Author: RichardErkhov
Pipeline Task: (not set)
Library: (not set)
Created: 2024-10-21
Last Modified: 2024-10-21
Gated: No
Private: No
HF SHA: 51ec1d031a4245a2fc087aea92702384604dd228
License: Unknown
Language: Unknown
Base Model: Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "",
    "summary": "model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device) model.eval() # change input text as desired prompt = \"Write a code to find the maximum value in a list of numbers.\" # tokenize the text input_tokens = tokenizer(prompt, return_tensors=\"pt\") # transfer tokenized inputs to the device for i in input_tokens: input_tokens[i] = input_tokens[i].to(device) # generate output tokens output = model.generate(**input_tokens, max_new_tokens=100) # decode output tokens into text output = tokenizer.batch_decode(output) # loop over the batch to print, in this example the batch size is 1 for i in output: print(i) ``` Additional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more quants, at much higher speed, than I would otherwise be able to.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nPowerMoE-3b - GGUF\n- Model creator: https://huggingface.co/ibm/\n- Original model: https://huggingface.co/ibm/PowerMoE-3b/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [PowerMoE-3b.Q2_K.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q2_K.gguf) | Q2_K | 1.18GB |\n| [PowerMoE-3b.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.IQ3_XS.gguf) | IQ3_XS | 1.32GB |\n| [PowerMoE-3b.IQ3_S.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.IQ3_S.gguf) | IQ3_S | 1.39GB |\n| [PowerMoE-3b.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q3_K_S.gguf) | Q3_K_S | 1.39GB |\n| [PowerMoE-3b.IQ3_M.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.IQ3_M.gguf) | IQ3_M | 1.41GB |\n| [PowerMoE-3b.Q3_K.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q3_K.gguf) | Q3_K | 1.53GB |\n| [PowerMoE-3b.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q3_K_M.gguf) | Q3_K_M | 1.53GB |\n| [PowerMoE-3b.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q3_K_L.gguf) | Q3_K_L | 1.65GB |\n| [PowerMoE-3b.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.IQ4_XS.gguf) | IQ4_XS | 1.72GB |\n| [PowerMoE-3b.Q4_0.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q4_0.gguf) | Q4_0 | 1.79GB |\n| [PowerMoE-3b.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.IQ4_NL.gguf) | IQ4_NL | 1.81GB |\n| 
[PowerMoE-3b.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q4_K_S.gguf) | Q4_K_S | 1.81GB |\n| [PowerMoE-3b.Q4_K.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q4_K.gguf) | Q4_K | 1.92GB |\n| [PowerMoE-3b.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q4_K_M.gguf) | Q4_K_M | 1.92GB |\n| [PowerMoE-3b.Q4_1.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q4_1.gguf) | Q4_1 | 1.99GB |\n| [PowerMoE-3b.Q5_0.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q5_0.gguf) | Q5_0 | 2.18GB |\n| [PowerMoE-3b.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q5_K_S.gguf) | Q5_K_S | 2.18GB |\n| [PowerMoE-3b.Q5_K.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q5_K.gguf) | Q5_K | 2.24GB |\n| [PowerMoE-3b.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q5_K_M.gguf) | Q5_K_M | 2.24GB |\n| [PowerMoE-3b.Q5_1.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q5_1.gguf) | Q5_1 | 2.37GB |\n| [PowerMoE-3b.Q6_K.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q6_K.gguf) | Q6_K | 2.59GB |\n| [PowerMoE-3b.Q8_0.gguf](https://huggingface.co/RichardErkhov/ibm_-_PowerMoE-3b-gguf/blob/main/PowerMoE-3b.Q8_0.gguf) | Q8_0 | 3.35GB |\n\n\n\n\nOriginal model description:\n---\npipeline_tag: text-generation\ninference: false\nlicense: apache-2.0\nlibrary_name: transformers\nmodel-index:\n- name: ibm/PowerMoE-3b\n  results:\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: ARC\n    metrics:\n    - name: accuracy-norm\n      type: accuracy-norm\n      value: 58.1\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n     
 type: lm-eval-harness\n      name: BoolQ\n    metrics:\n    - name: accuracy\n      type: accuracy\n      value: 65.0\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: Hellaswag\n    metrics:\n    - name: accuracy-norm\n      type: accuracy-norm\n      value: 71.5\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: OpenBookQA\n    metrics:\n    - name: accuracy-norm\n      type: accuracy-norm\n      value: 41.0\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: PIQA\n    metrics:\n    - name: accuracy-norm\n      type: accuracy-norm\n      value: 79.1\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: Winogrande\n    metrics:\n    - name: accuracy-norm\n      type: accuracy-norm\n      value: 65.0\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: MMLU (5 shot)\n    metrics:\n    - name: accuracy\n      type: accuracy\n      value: 42.8\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: GSM8k (5 shot)\n    metrics:\n    - name: accuracy\n      type: accuracy\n      value: 25.9\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: lm-eval-harness\n      name: math (4 shot)\n    metrics:\n    - name: accuracy\n      type: accuracy\n      value: 14.8\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: bigcode-eval\n      name: humaneval\n    metrics:\n    - name: pass@1\n      type: pass@1\n      value: 20.1\n      verified: false\n  - task:\n      type: text-generation\n    dataset:\n      type: bigcode-eval\n      name: MBPP\n    metrics:\n    - name: pass@1\n      type: pass@1\n      value: 
32.4\n      verified: false\n---\n\n## Model Summary\nPowerMoE-3B is a 3B sparse Mixture-of-Experts (sMoE) language model trained with the Power learning rate scheduler. It sparsely activates 800M parameters for each token. It is trained on a mix of open-source and proprietary datasets. PowerMoE-3B has shown promising results compared to other dense models with 2x activate parameters across various benchmarks, including natural language multi-choices, code generation, and math reasoning.\nPaper: https://arxiv.org/abs/2408.13359\n\n## Usage\nNote: Requires installing HF transformers from source.\n\n### Generation\nThis is a simple example of how to use **PowerMoE-3b** model.\n\n```python\nimport torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\ndevice = \"cuda\" # or \"cpu\"\nmodel_path = \"ibm/PowerMoE-3b\"\ntokenizer = AutoTokenizer.from_pretrained(model_path)\n# drop device_map if running on CPU\nmodel = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)\nmodel.eval()\n# change input text as desired\nprompt = \"Write a code to find the maximum value in a list of numbers.\"\n# tokenize the text\ninput_tokens = tokenizer(prompt, return_tensors=\"pt\")\n# transfer tokenized inputs to the device\nfor i in input_tokens:\n    input_tokens[i] = input_tokens[i].to(device)\n# generate output tokens\noutput = model.generate(**input_tokens, max_new_tokens=100)\n# decode output tokens into text\noutput = tokenizer.batch_decode(output)\n# loop over the batch to print, in this example the batch size is 1\nfor i in output:\n    print(i)\n```\n\n\nAdditional thanks to @nicoboss for giving me access to his private supercomputer, enabling me to provide many more quants, at much higher speed, than I would otherwise be able to.",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "arxiv:2408.13359",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 0,
  "downloads": 95,
  "gated": false,
  "private": false,
  "last_modified": "2024-10-21T04:23:17.000Z",
  "created_at": "2024-10-21T03:49:41.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
Source payload excerpt (from Hugging Face API)
{
  "_id": "6715cf557ebc9ce65c1153e8",
  "id": "RichardErkhov/ibm_-_PowerMoE-3b-gguf",
  "modelId": "RichardErkhov/ibm_-_PowerMoE-3b-gguf",
  "sha": "51ec1d031a4245a2fc087aea92702384604dd228",
  "createdAt": "2024-10-21T03:49:41.000Z",
  "lastModified": "2024-10-21T04:23:17.000Z",
  "author": "RichardErkhov",
  "downloads": 95,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
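
A minimal sketch of reading the payload excerpt above, assuming it is handled as a plain JSON string. It parses the two timestamps, computes the time between repository creation and the last modification, and lists the fields that came back as empty strings (which is why Pipeline and Library show no value on this sheet). The helper name `parse_ts` is hypothetical.

```python
import json
from datetime import datetime, timezone

# Payload excerpt copied from the sheet.
PAYLOAD = """
{
  "_id": "6715cf557ebc9ce65c1153e8",
  "id": "RichardErkhov/ibm_-_PowerMoE-3b-gguf",
  "modelId": "RichardErkhov/ibm_-_PowerMoE-3b-gguf",
  "sha": "51ec1d031a4245a2fc087aea92702384604dd228",
  "createdAt": "2024-10-21T03:49:41.000Z",
  "lastModified": "2024-10-21T04:23:17.000Z",
  "author": "RichardErkhov",
  "downloads": 95,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
"""


def parse_ts(value):
    """Parse a timestamp such as 2024-10-21T03:49:41.000Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


record = json.loads(PAYLOAD)
created = parse_ts(record["createdAt"])
modified = parse_ts(record["lastModified"])

# Time between repository creation and the last commit.
upload_window = modified - created

# Fields the API returned as empty strings, i.e. not set on the repository.
unset_fields = sorted(key for key, value in record.items() if value == "")

if __name__ == "__main__":
    print(f"upload window: {upload_window}")
    print(f"unset fields: {unset_fields}")
```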