GraySoft
Model Intelligence Sheet

quantfactory/amd-llama-350m-upgraded-gguf overview

This is a quantized version of reflex-ai/AMD-Llama-350M-Upgraded, created using llama.cpp.

transformers, gguf, causal-lm, llama, reflex-ai, en, license:apache-2.0, endpoints_compatible, region:us
Downloads: 248
Likes: 2
Pipeline: not set
Library: transformers
Visibility: Public
Access: Open

Repository Files & Downloads

14 files detected
Direct downloads for all repository files
File                                 Type  Quantization  Size       Link
AMD-Llama-350M-Upgraded.Q2_K.gguf    GGUF  Q2_K          126.41 MB  Download
AMD-Llama-350M-Upgraded.Q3_K_L.gguf  GGUF  Q3_K_L        171.68 MB  Download
AMD-Llama-350M-Upgraded.Q3_K_M.gguf  GGUF  Q3_K_M        159.82 MB  Download
AMD-Llama-350M-Upgraded.Q3_K_S.gguf  GGUF  Q3_K_S        146.16 MB  Download
AMD-Llama-350M-Upgraded.Q4_0.gguf    GGUF  Q4_0          185.13 MB  Download
AMD-Llama-350M-Upgraded.Q4_1.gguf    GGUF  Q4_1          203.47 MB  Download
AMD-Llama-350M-Upgraded.Q4_K_M.gguf  GGUF  Q4_K_M        196.15 MB  Download
AMD-Llama-350M-Upgraded.Q4_K_S.gguf  GGUF  Q4_K_S        186.54 MB  Download
AMD-Llama-350M-Upgraded.Q5_0.gguf    GGUF  Q5_0          221.81 MB  Download
AMD-Llama-350M-Upgraded.Q5_1.gguf    GGUF  Q5_1          240.15 MB  Download
AMD-Llama-350M-Upgraded.Q5_K_M.gguf  GGUF  Q5_K_M        227.49 MB  Download
AMD-Llama-350M-Upgraded.Q5_K_S.gguf  GGUF  Q5_K_S        221.81 MB  Download
AMD-Llama-350M-Upgraded.Q6_K.gguf    GGUF  Q6_K          260.78 MB  Download
AMD-Llama-350M-Upgraded.Q8_0.gguf    GGUF  Q8_0          337.53 MB  Download
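
To compare the quantization levels, the file sizes can be converted into a rough bits-per-weight figure. The sketch below is illustrative only: it uses the 332 million parameter count stated in the model card, and it assumes the listed "MB" means MiB (2**20 bytes), which the page does not confirm. The result includes GGUF metadata and any tensors kept at higher precision, so treat it as an average, not the nominal bit width of the format.

```python
# Sizes copied from the file table; only three rows are shown for brevity.
SIZES_MB = {
    "Q2_K": 126.41,
    "Q4_K_M": 196.15,
    "Q8_0": 337.53,
}

PARAMS = 332_000_000      # parameter count stated in the model card
BYTES_PER_UNIT = 2**20    # assumption: set to 10**6 if the listing uses decimal MB


def bits_per_weight(size_mb, params=PARAMS, bytes_per_unit=BYTES_PER_UNIT):
    """Average number of bits stored per parameter, file overhead included."""
    return size_mb * bytes_per_unit * 8 / params


for quant, size in SIZES_MB.items():
    print(f"{quant}: {bits_per_weight(size):.2f} bits/weight")
```

Smaller files trade output quality for memory, so a figure like this is mainly useful for choosing a file that fits the available RAM.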

Model Details

Model Slug: quantfactory/amd-llama-350m-upgraded-gguf
Author: QuantFactory
Pipeline Task: not set
Library: transformers
Created: 2024-12-10
Last Modified: 2024-12-10
Gated: No
Private: No
HF SHA: 0aeba8202556ece8486b9e6bf9fb8a2d2928952c
License: apache-2.0 (from the model card)
Language: en (from the model card)
Base Model: reflex-ai/AMD-Llama-350M-Upgraded (from the model card)
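
The repository ID and HF SHA in these details are enough to pin a download to this exact revision. The following is a minimal sketch, assuming the usual Hugging Face `resolve/<revision>/<filename>` URL layout; the helper names `gguf_filename` and `pinned_url` are hypothetical, and whether the Download links on this page use the same layout is not confirmed by the source.

```python
REPO_ID = "QuantFactory/AMD-Llama-350M-Upgraded-GGUF"
REVISION = "0aeba8202556ece8486b9e6bf9fb8a2d2928952c"  # HF SHA from the details list


def gguf_filename(quant: str) -> str:
    """File name for a quantization label, following the naming in the file table."""
    return f"AMD-Llama-350M-Upgraded.{quant}.gguf"


def pinned_url(quant: str, revision: str = REVISION) -> str:
    """Download URL for one GGUF file at a fixed revision (assumed URL layout)."""
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{gguf_filename(quant)}"


print(pinned_url("Q4_K_M"))
```

Pinning to the commit hash instead of `main` keeps a download reproducible if the repository is updated later.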

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "language": "en",
    "license": "apache-2.0",
    "tags": [
      "causal-lm",
      "transformers",
      "llama",
      "reflex-ai"
    ],
    "frontmatter": {},
    "hero_image_url": "https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ",
    "summary": "This is quantized version of reflex-ai/AMD-Llama-350M-Upgraded created using llama.cpp # Original Model Card # AMD Llama 350M Upgraded",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "\n---\n\nlanguage: en\nlicense: apache-2.0\ntags:\n  - causal-lm\n  - transformers\n  - llama\n  - reflex-ai\n\n---\n\n[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)\n\n\n# QuantFactory/AMD-Llama-350M-Upgraded-GGUF\nThis is quantized version of [reflex-ai/AMD-Llama-350M-Upgraded](https://huggingface.co/reflex-ai/AMD-Llama-350M-Upgraded) created using llama.cpp\n\n# Original Model Card\n\n\n# AMD Llama 350M Upgraded\n\n## Model Description\n\nThe **AMD Llama 350M Upgraded** is a transformer-based causal language model built on the Llama architecture, designed to generate human-like text. This model has been upgraded from the original AMD Llama 135M model to provide enhanced performance with an increased parameter count of 332 million. It is suitable for various natural language processing tasks, including text generation, completion, and conversational applications.\n\n## Model Details\n\n- **Model Type**: Causal Language Model\n- **Architecture**: Llama\n- **Number of Parameters**: 332 million\n- **Input Size**: Variable-length input sequences\n- **Output Size**: Variable-length output sequences\n\n## Usage\n\nTo use the AMD Llama 350M Upgraded model, you can utilize the `transformers` library. 
Here’s a sample code snippet to get started:\n\n```python\nimport torch\nfrom transformers import LlamaForCausalLM, LlamaTokenizer\n\n# Load the tokenizer and model\nmodel_name = \"reflex-ai/AMD-Llama-350M-Upgraded\"\ntokenizer = LlamaTokenizer.from_pretrained(model_name)\nmodel = LlamaForCausalLM.from_pretrained(model_name)\n\n# Set the model to evaluation mode\nmodel.eval()\n\n# Function to generate text\ndef generate_text(prompt, max_length=50):\n    inputs = tokenizer.encode(prompt, return_tensors='pt', padding=True, truncation=True)\n    attention_mask = (inputs != tokenizer.pad_token_id).long()\n\n    if torch.cuda.is_available():\n        inputs = inputs.to('cuda')\n        attention_mask = attention_mask.to('cuda')\n\n    with torch.no_grad():\n        outputs = model.generate(inputs, attention_mask=attention_mask, max_length=max_length, num_return_sequences=1)\n    \n    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)\n    return generated_text\n\n# Example usage\nprompt = \"Once upon a time in a land far away,\"\ngenerated_output = generate_text(prompt, max_length=100)\nprint(generated_output)\n\n",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "causal-lm",
    "llama",
    "reflex-ai",
    "en",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 2,
  "downloads": 248,
  "gated": false,
  "private": false,
  "last_modified": "2024-12-10T15:50:56.000Z",
  "created_at": "2024-12-10T15:48:52.000Z",
  "pipeline_tag": "",
  "library_name": "transformers"
}
Source payload excerpt (from Hugging Face API)
{
  "_id": "675862e4fffc71684da86276",
  "id": "QuantFactory/AMD-Llama-350M-Upgraded-GGUF",
  "modelId": "QuantFactory/AMD-Llama-350M-Upgraded-GGUF",
  "sha": "0aeba8202556ece8486b9e6bf9fb8a2d2928952c",
  "createdAt": "2024-12-10T15:48:52.000Z",
  "lastModified": "2024-12-10T15:50:56.000Z",
  "author": "QuantFactory",
  "downloads": 248,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "transformers",
  "siblings_count": 16
}
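
The `createdAt` and `lastModified` values in the payload are ISO 8601 timestamps in UTC. As a small sketch of how they might be read, the snippet below parses both with the standard library and measures the gap between them; the helper name `parse_hf_timestamp` is hypothetical, and the format string assumes the millisecond-plus-`Z` layout shown in this excerpt.

```python
from datetime import datetime, timezone

# Values copied from the payload excerpt.
payload = {
    "createdAt": "2024-12-10T15:48:52.000Z",
    "lastModified": "2024-12-10T15:50:56.000Z",
}


def parse_hf_timestamp(value: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SS.mmmZ' string into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


created = parse_hf_timestamp(payload["createdAt"])
modified = parse_hf_timestamp(payload["lastModified"])
elapsed = modified - created

print(created.date().isoformat(), modified.date().isoformat())
print(f"{elapsed.total_seconds():.0f} seconds between creation and last modification")
```

Both timestamps fall on 2024-12-10, which matches the Created and Last Modified fields in the details section.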