Model Intelligence Sheet

quantfactory/amd-llama-350m-upgraded-gguf overview

This is a quantized version of reflex-ai/AMD-Llama-350M-Upgraded, created using llama.cpp.

transformers, gguf, causal-lm, llama, reflex-ai, en, license:apache-2.0, endpoints_compatible, region:us

Downloads

248

Likes

2

Pipeline

—

Library

transformers

Visibility

Public

Access

Open

Repository Files & Downloads

14 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
AMD-Llama-350M-Upgraded.Q2_K.gguf	GGUF	Q2_K	126.41 MB	Download
AMD-Llama-350M-Upgraded.Q3_K_L.gguf	GGUF	Q3_K_L	171.68 MB	Download
AMD-Llama-350M-Upgraded.Q3_K_M.gguf	GGUF	Q3_K_M	159.82 MB	Download
AMD-Llama-350M-Upgraded.Q3_K_S.gguf	GGUF	Q3_K_S	146.16 MB	Download
AMD-Llama-350M-Upgraded.Q4_0.gguf	GGUF	Q4_0	185.13 MB	Download
AMD-Llama-350M-Upgraded.Q4_1.gguf	GGUF	Q4_1	203.47 MB	Download
AMD-Llama-350M-Upgraded.Q4_K_M.gguf	GGUF	Q4_K_M	196.15 MB	Download
AMD-Llama-350M-Upgraded.Q4_K_S.gguf	GGUF	Q4_K_S	186.54 MB	Download
AMD-Llama-350M-Upgraded.Q5_0.gguf	GGUF	Q5_0	221.81 MB	Download
AMD-Llama-350M-Upgraded.Q5_1.gguf	GGUF	Q5_1	240.15 MB	Download
AMD-Llama-350M-Upgraded.Q5_K_M.gguf	GGUF	Q5_K_M	227.49 MB	Download
AMD-Llama-350M-Upgraded.Q5_K_S.gguf	GGUF	Q5_K_S	221.81 MB	Download
AMD-Llama-350M-Upgraded.Q6_K.gguf	GGUF	Q6_K	260.78 MB	Download
AMD-Llama-350M-Upgraded.Q8_0.gguf	GGUF	Q8_0	337.53 MB	Download
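
Choosing among these files is mostly a trade-off between size and fidelity. The following is a minimal sketch, not part of the original listing, that picks the largest file fitting a size budget. The sizes are copied from the table above; the helper name `pick_quant` and the idea of using file size as a budget are illustrative assumptions, and file size is only a rough stand-in for the memory a runtime actually needs.

```python
# Sizes in MB, copied from the file table above.
QUANT_SIZES_MB = {
    "Q2_K": 126.41,
    "Q3_K_S": 146.16,
    "Q3_K_M": 159.82,
    "Q3_K_L": 171.68,
    "Q4_0": 185.13,
    "Q4_K_S": 186.54,
    "Q4_K_M": 196.15,
    "Q4_1": 203.47,
    "Q5_0": 221.81,
    "Q5_K_S": 221.81,
    "Q5_K_M": 227.49,
    "Q5_1": 240.15,
    "Q6_K": 260.78,
    "Q8_0": 337.53,
}


def pick_quant(budget_mb):
    """Return (quantization, filename) for the largest file within budget_mb.

    Hypothetical helper: returns None when no file fits the budget.
    """
    fitting = {q: size for q, size in QUANT_SIZES_MB.items() if size <= budget_mb}
    if not fitting:
        return None
    best = max(fitting, key=fitting.get)
    return best, f"AMD-Llama-350M-Upgraded.{best}.gguf"


# Example: the largest file that stays at or under 200 MB.
print(pick_quant(200))
```

With a 200 MB budget this selects the Q4_K_M file (196.15 MB). Treat the result as a starting point and compare output quality across quantizations before settling on one.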

Model Details

Model Slug

quantfactory/amd-llama-350m-upgraded-gguf

Author

QuantFactory

Pipeline Task

—

Library

transformers

Created

2024-12-10

Last Modified

2024-12-10

Gated

No

Private

No

HF SHA

0aeba8202556ece8486b9e6bf9fb8a2d2928952c

License

apache-2.0

Language

en

Base Model

reflex-ai/AMD-Llama-350M-Upgraded

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "language": "en",
    "license": "apache-2.0",
    "tags": [
      "causal-lm",
      "transformers",
      "llama",
      "reflex-ai"
    ],
    "frontmatter": {},
    "hero_image_url": "https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ",
    "summary": "This is a quantized version of reflex-ai/AMD-Llama-350M-Upgraded, created using llama.cpp.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "\n---\n\nlanguage: en\nlicense: apache-2.0\ntags:\n  - causal-lm\n  - transformers\n  - llama\n  - reflex-ai\n\n---\n\n[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)\n\n\n# QuantFactory/AMD-Llama-350M-Upgraded-GGUF\nThis is a quantized version of [reflex-ai/AMD-Llama-350M-Upgraded](https://huggingface.co/reflex-ai/AMD-Llama-350M-Upgraded) created using llama.cpp\n\n# Original Model Card\n\n\n# AMD Llama 350M Upgraded\n\n## Model Description\n\nThe **AMD Llama 350M Upgraded** is a transformer-based causal language model built on the Llama architecture, designed to generate human-like text. This model has been upgraded from the original AMD Llama 135M model to provide enhanced performance with an increased parameter count of 332 million. It is suitable for various natural language processing tasks, including text generation, completion, and conversational applications.\n\n## Model Details\n\n- **Model Type**: Causal Language Model\n- **Architecture**: Llama\n- **Number of Parameters**: 332 million\n- **Input Size**: Variable-length input sequences\n- **Output Size**: Variable-length output sequences\n\n## Usage\n\nTo use the AMD Llama 350M Upgraded model, you can utilize the `transformers` library. Here’s a sample code snippet to get started:\n\n```python\nimport torch\nfrom transformers import LlamaForCausalLM, LlamaTokenizer\n\n# Load the tokenizer and model\nmodel_name = \"reflex-ai/AMD-Llama-350M-Upgraded\"\ntokenizer = LlamaTokenizer.from_pretrained(model_name)\nmodel = LlamaForCausalLM.from_pretrained(model_name)\n\n# Move the model to the GPU when one is available, then set evaluation mode\ndevice = 'cuda' if torch.cuda.is_available() else 'cpu'\nmodel.to(device)\nmodel.eval()\n\n# Function to generate text\ndef generate_text(prompt, max_length=50):\n    # Calling the tokenizer returns input_ids together with a matching attention_mask\n    inputs = tokenizer(prompt, return_tensors='pt').to(device)\n\n    with torch.no_grad():\n        outputs = model.generate(**inputs, max_length=max_length, num_return_sequences=1)\n\n    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)\n    return generated_text\n\n# Example usage\nprompt = \"Once upon a time in a land far away,\"\ngenerated_output = generate_text(prompt, max_length=100)\nprint(generated_output)\n```\n",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "causal-lm",
    "llama",
    "reflex-ai",
    "en",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us"
  ],
  "likes": 2,
  "downloads": 248,
  "gated": false,
  "private": false,
  "last_modified": "2024-12-10T15:50:56.000Z",
  "created_at": "2024-12-10T15:48:52.000Z",
  "pipeline_tag": "",
  "library_name": "transformers"
}
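
The normalized record carries the license and language under `card_data`, and the license again as a `license:`-prefixed entry in `tags`. As a rough illustration of how display fields could be derived from such a record, the sketch below reads `card_data` first and falls back to the tag. The function name `resolve_license_and_language` and the `"Unknown"` fallback are assumptions made for this example, not the logic of whatever tool produced this sheet.

```python
def resolve_license_and_language(record):
    """Derive license and language from a normalized metadata record.

    Hypothetical helper: prefers card_data, then falls back to a
    "license:<id>" tag; missing values are reported as "Unknown".
    """
    card = record.get("card_data", {})
    tags = record.get("tags", [])

    license_id = card.get("license")
    if not license_id:
        # Fall back to the first tag of the form "license:<id>".
        license_id = next(
            (tag.split(":", 1)[1] for tag in tags if tag.startswith("license:")),
            None,
        )

    language = card.get("language")
    return {
        "license": license_id or "Unknown",
        "language": language or "Unknown",
    }


# A trimmed copy of the record above, keeping only the fields used here.
record = {
    "card_data": {"language": "en", "license": "apache-2.0"},
    "tags": ["transformers", "gguf", "license:apache-2.0", "region:us"],
}
print(resolve_license_and_language(record))
```

For the trimmed record this prints `{'license': 'apache-2.0', 'language': 'en'}`.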

Source payload excerpt (from Hugging Face API)

{
  "_id": "675862e4fffc71684da86276",
  "id": "QuantFactory/AMD-Llama-350M-Upgraded-GGUF",
  "modelId": "QuantFactory/AMD-Llama-350M-Upgraded-GGUF",
  "sha": "0aeba8202556ece8486b9e6bf9fb8a2d2928952c",
  "createdAt": "2024-12-10T15:48:52.000Z",
  "lastModified": "2024-12-10T15:50:56.000Z",
  "author": "QuantFactory",
  "downloads": 248,
  "likes": 2,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "transformers",
  "siblings_count": 16
}
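
The `id` and `sha` fields in this payload are enough to pin a download to the exact revision described here. The sketch below builds such a URL, assuming the Hub's usual `resolve/<revision>/<filename>` path layout and assuming the file exists at that revision; `gguf_url` is a hypothetical helper name, and nothing in the source payload lists these URLs.

```python
from urllib.parse import quote


def gguf_url(repo_id, filename, revision="main"):
    """Build a direct-download URL for one file in a Hub repository.

    Assumes the conventional https://huggingface.co/<repo>/resolve/<rev>/<file>
    layout; the revision may be a branch name or a full commit SHA.
    """
    return (
        f"https://huggingface.co/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


# Values taken from the payload and the file table on this page.
url = gguf_url(
    "QuantFactory/AMD-Llama-350M-Upgraded-GGUF",
    "AMD-Llama-350M-Upgraded.Q4_K_M.gguf",
    revision="0aeba8202556ece8486b9e6bf9fb8a2d2928952c",
)
print(url)
```

Pinning to the commit SHA rather than `main` keeps the download reproducible if the repository is updated later.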