Model Intelligence Sheet

quantfactory/qwen2.5-math-1.5b-instruct-gguf overview

This is quantized version of Qwen/Qwen2.5-Math-1.5B-Instruct created using llama.cpp # Original Model Card # Qwen2.5-Math-1.5B-Instruct 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks. >

transformersggufchattext-generationenarxiv:2407.10671base_model:Qwen/Qwen2.5-Math-1.5Bbase_model:quantized:Qwen/Qwen2.5-Math-1.5Blicense:apache-2.0endpoints_compatibleregion:usconversational

quantfactory/qwen2.5-math-1.5b-instruct-gguf visual

Downloads

159

Likes

Pipeline

text-generation

Library

transformers

Visibility

Public

Access

Open

Repository Files & Downloads

14 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Qwen2.5-Math-1.5B-Instruct.Q2_K.gguf	GGUF	Q2_K	644.97 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q3_K_L.gguf	GGUF	Q3_K_L	839.39 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q3_K_M.gguf	GGUF	Q3_K_M	786.00 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q3_K_S.gguf	GGUF	Q3_K_S	725.69 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q4_0.gguf	GGUF	—	891.64 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q4_1.gguf	GGUF	—	969.74 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q4_K_M.gguf	GGUF	Q4_K_M	940.37 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q4_K_S.gguf	GGUF	Q4_K_S	896.75 MB	Download
Qwen2.5-Math-1.5B-Instruct.Q5_0.gguf	GGUF	—	1.02 GB	Download
Qwen2.5-Math-1.5B-Instruct.Q5_1.gguf	GGUF	—	1.10 GB	Download
Qwen2.5-Math-1.5B-Instruct.Q5_K_M.gguf	GGUF	Q5_K_M	1.05 GB	Download
Qwen2.5-Math-1.5B-Instruct.Q5_K_S.gguf	GGUF	Q5_K_S	1.02 GB	Download
Qwen2.5-Math-1.5B-Instruct.Q6_K.gguf	GGUF	Q6_K	1.19 GB	Download
Qwen2.5-Math-1.5B-Instruct.Q8_0.gguf	GGUF	—	1.53 GB	Download

Model Details Live

Model Slug

quantfactory/qwen2.5-math-1.5b-instruct-gguf

Author

QuantFactory

Pipeline Task

text-generation

Library

transformers

Created

2024-09-19

Last Modified

2024-09-19

Gated

Private

HF SHA

2372926a510ee102cdfe86206379407ff3c85a56

License

Unknown

Language

Unknown

Base Model

Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "base_model": "Qwen/Qwen2.5-Math-1.5B",
    "language": [
      "en"
    ],
    "pipeline_tag": "text-generation",
    "tags": [
      "chat"
    ],
    "library_name": "transformers",
    "license": "apache-2.0",
    "license_link": "https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct/blob/main/LICENSE",
    "frontmatter": {},
    "hero_image_url": "https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ",
    "summary": "This is quantized version of Qwen/Qwen2.5-Math-1.5B-Instruct created using llama.cpp # Original Model Card # Qwen2.5-Math-1.5B-Instruct > [!Warning] >  >  > 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks. >  >",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "\n---\n\nbase_model: Qwen/Qwen2.5-Math-1.5B\nlanguage:\n- en\npipeline_tag: text-generation\ntags:\n- chat\nlibrary_name: transformers\nlicense: apache-2.0\nlicense_link: https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct/blob/main/LICENSE\n\n---\n\n[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)\n\n\n# QuantFactory/Qwen2.5-Math-1.5B-Instruct-GGUF\nThis is quantized version of [Qwen/Qwen2.5-Math-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct) created using llama.cpp\n\n# Original Model Card\n\n\n\n# Qwen2.5-Math-1.5B-Instruct\n\n> [!Warning]\n> <div align=\"center\">\n> <b>\n> 🚨 Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks.\n> </b>\n> </div>\n\n## Introduction\n\nIn August 2024, we released the first series of mathematical LLMs - [Qwen2-Math](https://qwenlm.github.io/blog/qwen2-math/) - of our Qwen family. A month later, we have upgraded it and open-sourced **Qwen2.5-Math** series, including base models **Qwen2.5-Math-1.5B/7B/72B**, instruction-tuned models **Qwen2.5-Math-1.5B/7B/72B-Instruct**, and mathematical reward model **Qwen2.5-Math-RM-72B**. \n                                                                              \nUnlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT. \n\n![](http://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen2.5/qwen2.5-math-pipeline.jpeg)\n\nWhile CoT plays a vital role in enhancing the reasoning capabilities of LLMs, it faces challenges in achieving computational accuracy and handling complex mathematical or algorithmic reasoning tasks, such as finding the roots of a quadratic equation or computing the eigenvalues of a matrix. TIR can further improve the model's proficiency in precise computation, symbolic manipulation, and algorithmic manipulation. Qwen2.5-Math-1.5B/7B/72B-Instruct achieve 79.7, 85.3, and 87.8 respectively on the MATH benchmark using TIR. \n\n## Model Details\n\n\nFor more details, please refer to our [blog post](https://qwenlm.github.io/blog/qwen2.5-math/) and [GitHub repo](https://github.com/QwenLM/Qwen2.5-Math).\n\n\n## Requirements\n* `transformers>=4.37.0` for Qwen2.5-Math models. The latest version is recommended.\n\n> [!Warning]\n> <div align=\"center\">\n> <b>\n> 🚨 This is a must because <code>transformers</code> integrated Qwen2 codes since <code>4.37.0</code>.\n> </b>\n> </div>\n\nFor requirements on GPU memory and the respective throughput, see similar results of Qwen2 [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html).\n\n## Quick Start\n\n> [!Important]\n>\n> **Qwen2.5-Math-1.5B-Instruct** is an instruction model for chatting;\n>\n> **Qwen2.5-Math-1.5B** is a base model typically used for completion and few-shot inference, serving as a better starting point for fine-tuning.\n> \n\n### 🤗 Hugging Face Transformers\n\nQwen2.5-Math can be deployed and infered in the same way as [Qwen2.5](https://github.com/QwenLM/Qwen2.5). Here we show a code snippet to show you how to use the chat model with `transformers`:\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_name = \"Qwen/Qwen2.5-Math-1.5B-Instruct\"\ndevice = \"cuda\" # the device to load the model onto\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    model_name,\n    torch_dtype=\"auto\",\n    device_map=\"auto\"\n)\ntokenizer = AutoTokenizer.from_pretrained(model_name)\n\nprompt = \"Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$.\"\n\n# CoT\nmessages = [\n    {\"role\": \"system\", \"content\": \"Please reason step by step, and put your final answer within \\\\boxed{}.\"},\n    {\"role\": \"user\", \"content\": prompt}\n]\n\n# TIR\nmessages = [\n    {\"role\": \"system\", \"content\": \"Please integrate natural language reasoning with programs to solve the problem above, and put your final answer within \\\\boxed{}.\"},\n    {\"role\": \"user\", \"content\": prompt}\n]\n\ntext = tokenizer.apply_chat_template(\n    messages,\n    tokenize=False,\n    add_generation_prompt=True\n)\nmodel_inputs = tokenizer([text], return_tensors=\"pt\").to(device)\n\ngenerated_ids = model.generate(\n    **model_inputs,\n    max_new_tokens=512\n)\ngenerated_ids = [\n    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)\n]\n\nresponse = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]\n```\n\n## Citation\n\nIf you find our work helpful, feel free to give us a citation.\n\n```\n@article{yang2024qwen2,\n  title={Qwen2 technical report},\n  author={Yang, An and Yang, Baosong and Hui, Binyuan and Zheng, Bo and Yu, Bowen and Zhou, Chang and Li, Chengpeng and Li, Chengyuan and Liu, Dayiheng and Huang, Fei and others},\n  journal={arXiv preprint arXiv:2407.10671},\n  year={2024}\n}\n```\n",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "chat",
    "text-generation",
    "en",
    "arxiv:2407.10671",
    "base_model:Qwen/Qwen2.5-Math-1.5B",
    "base_model:quantized:Qwen/Qwen2.5-Math-1.5B",
    "license:apache-2.0",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 1,
  "downloads": 159,
  "gated": false,
  "private": false,
  "last_modified": "2024-09-19T03:16:03.000Z",
  "created_at": "2024-09-19T03:07:38.000Z",
  "pipeline_tag": "text-generation",
  "library_name": "transformers"
}

Source payload excerpt (from Hugging Face API)

{
  "_id": "66eb957a3a3377f7a479035a",
  "id": "QuantFactory/Qwen2.5-Math-1.5B-Instruct-GGUF",
  "modelId": "QuantFactory/Qwen2.5-Math-1.5B-Instruct-GGUF",
  "sha": "2372926a510ee102cdfe86206379407ff3c85a56",
  "createdAt": "2024-09-19T03:07:38.000Z",
  "lastModified": "2024-09-19T03:16:03.000Z",
  "author": "QuantFactory",
  "downloads": 159,
  "likes": 1,
  "gated": false,
  "private": false,
  "pipeline_tag": "text-generation",
  "library_name": "transformers",
  "siblings_count": 16
}