Model Intelligence Sheet
nectec/pathumma-thaillm-8b-think-3.0.0-gguf overview
Post-trained Thai Large Language Model built upon the foundation model from the Thai national initiative ThaiLLM. This release applies multi-stage Supervised Fine-Tuning (SFT) to enhance: ---
Downloads
104
Likes
0
Pipeline
—
Library
—
Visibility
Public
Access
Open
Repository Files & Downloads
5 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| pathumma-thaillm-8b-think-3-Q4_K_M.gguf | GGUF | Q4_K_M | 4.68 GB | Download |
| pathumma-thaillm-8b-think-3-Q5_0.gguf | GGUF | — | 5.33 GB | Download |
| pathumma-thaillm-8b-think-3-Q5_K_M.gguf | GGUF | Q5_K_M | 5.45 GB | Download |
| pathumma-thaillm-8b-think-3-Q6_K.gguf | GGUF | Q6_K | 6.26 GB | Download |
| pathumma-thaillm-8b-think-3-Q8_0.gguf | GGUF | — | 8.11 GB | Download |
Model Details Live
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"license": "apache-2.0",
"frontmatter": {
"license": "apache-2.0"
},
"hero_image_url": "pathumma-thaillm-300.png",
"summary": "Post-trained Thai Large Language Model built upon the foundation model from the Thai national initiative **ThaiLLM**. This release applies multi-stage Supervised Fine-Tuning (SFT) to enhance: ---",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "---\nlicense: apache-2.0\n---\n\n\n\n# Pathumma-ThaiLLM-Think-3.0.0\n\nPost-trained Thai Large Language Model built upon the foundation model from the Thai national initiative [**ThaiLLM**](https://www.thaillm.or.th).\n\nThis release applies multi-stage Supervised Fine-Tuning (SFT) to enhance:\n\n- Instruction following\n- Structured tool / function calling\n- Mathematical and coding competence\n- Multi-step analytical capability\n- Thai–English bilingual robustness\n\n---\n\n## Training Strategy\n\nPost-training is organized into **two stages**:\n\n- **Stage 1:** Instruction & Tool-Calling Alignment \n- **Stage 2:** Reasoning Specialization \n\nFor selected corpora, only curated subsets were used to maintain domain balance.\n\n---\n\n## Stage 1: Instruction & Tool-Calling Alignment\n\nFocus areas:\n\n- Instruction compliance \n- Structured tool-call formatting \n- General Thai task robustness \n- STEM-oriented instruction alignment \n\n### Datasets\n\n| Dataset | Training Subset Size | Full Dataset Size | Domain | License |\n|----------|---------------------|------------------|----------|----------|\n| beyoru/ToolCall_synthetic_qwen3 | 60,000 | 60,000 | Tool | Apache-2.0 |\n| airesearch/WangchanX-FLAN-v6 | 2,000,000 | 13,619,450 | General | Mixed |\n| nvidia/OpenMathInstruct-2 | 1,000,000 | 14,000,000 | STEM | CC-BY-4.0 |\n| jdaddyalbs/playwright-mcp-toolcalling | 1,750 | 1,750 | Tool | MIT |\n| BitAgent/tool_calling | 551,000 | 551,000 | Tool | MIT |\n\n<br>\n\n## Stage 2: Reasoning Specialization\n\nFocus areas:\n\n- Multi-step mathematical analysis \n- Code understanding and synthesis \n- Structured analytical responses \n- Tool-calling with explicit reasoning traces \n- Thai reasoning distillation \n\n### Datasets\n\n| Dataset | Training Subset Size | Full Dataset Size | Domain | License |\n|----------|---------------------|------------------|----------|----------|\n| nvidia/OpenMathReasoning | 500,000 | 4,920,000 | STEM | CC-BY-4.0 |\n| nvidia/OpenCodeReasoning | 585,000 | 585,000 | Coding | CC-BY-4.0 |\n| natolambert/GeneralThought-430K-filtered | 337,579 | 337,579 | General | MIT |\n| Jofthomas/hermes-function-calling-thinking-V1 | 3,570 | 3,570 | Tool | MIT |\n| open-thoughts/OpenThoughts3-1.2M | 1,200,000 | 1,200,000 | STEM | Apache-2.0 |\n| scb10x/typhoon-r1-sft-data | 23,851 | 23,851 | General | Custom |\n| iapp/Thai-R1-Distill-SFT | 10,000 | 10,000 | General | Custom |\n| nvidia/Nemotron-Post-Training-Dataset-v1 | 310,000 | 310,000 | Tool | CC-BY-4.0 |\n\n---\n\n> **Note:** For selected datasets, curated subsets were employed to ensure balanced domain representation.\n\n---\n\n## Methodology\n\n- Base model: ThaiLLM foundation model \n- Training objective: Supervised Fine-Tuning (SFT) \n- Two-stage curriculum design \n- Domain-balanced optimization \n- Tool-call schema alignment \n- Thai reasoning distillation\n\n---\n\n## Compute Infrastructure\n\nTraining was conducted on the LANTA high-performance computing cluster, utilizing 16 nodes (64×A100 40GB GPUs) for distributed large-scale post-training.\n## Capabilities\n\n- Thai instruction compliance \n- Structured JSON tool invocation \n- Mathematical problem solving \n- Code generation and analysis \n- Multi-step analytical tasks \n- Thai–English bilingual support\n\n---\n\n## Limitations\n\n- May hallucinate if tool schema is incomplete \n- Performance on long analytical chains may degrade without retrieval \n- Domain coverage depends on included corpora \n\n---\n<br>\n\n# Quickstart\nThe code of Qwen3 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`.\nWith `transformers<4.51.0`, you will encounter the following error:\n```\nKeyError: 'qwen3'\n```\nThe following contains a code snippet illustrating how to use the model generate content based on given inputs. \n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\nmodel_name = \"nectec/pathumma-thaillm-8b-think-3.0.0\"\n# load the tokenizer and the model\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForCausalLM.from_pretrained(\n model_name,\n torch_dtype=\"auto\",\n device_map=\"auto\"\n)\n# prepare the model input\nprompt = \"ทำไมวงกลมถึงมี 360 องศา\"\nmessages = [\n {\"role\": \"user\", \"content\": prompt}\n]\ntext = tokenizer.apply_chat_template(\n messages,\n tokenize=False,\n add_generation_prompt=True,\n)\nmodel_inputs = tokenizer([text], return_tensors=\"pt\").to(model.device)\n# conduct text completion\ngenerated_ids = model.generate(\n **model_inputs,\n max_new_tokens=32768\n)\noutput_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() \n# parsing thinking content\ntry:\n # rindex finding 151668 (</think>)\n index = len(output_ids) - output_ids[::-1].index(151668)\nexcept ValueError:\n index = 0\nthinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip(\"\\n\")\ncontent = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip(\"\\n\")\nprint(\"thinking content:\", thinking_content) # no opening <think> tag\nprint(\"content:\", content)\n```\nFor deployment, you can use `vllm>=0.8.5` to create an OpenAI-compatible API endpoint:\n\n```shell\nvllm serve nectec/pathumma-thaillm-8b-think-3.0.0 \\\n --enforce-eager \\\n --no-enable-chunked-prefill \\\n --tool-call-parser hermes\n```\n\nFor local use, applications such as Ollama, LMStudio, and llama.cpp have also supported.\n\n## About the Project\n\nPathumma-ThaiLLM-Think-3.0.0 is part of ongoing research toward sovereign Thai large language models optimized for analytical and tool-augmented intelligence.\n\n# Contributor Contract\n**LLM Team** \n<br>\nPiyawat Chuangkrud (piyawat@it.kmitl.ac.th)<br>\nChanon Utupon (s6401001620165@email.kmutnb.ac.th)<br>\nJessada Pranee (jessada.pran@kmutt.ac.th)<br>\nArnon Saeoung (anon.saeoueng@gmail.com)<br>\nChaianun Damrongrat (chaianun.damrongrat@nectec.or.th)<br>\nSarawoot Kongyoung (sarawoot.kongyoung@nectec.or.th)\n",
"related_quantizations": []
},
"tags": [
"gguf",
"license:apache-2.0",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 0,
"downloads": 104,
"gated": false,
"private": false,
"last_modified": "2026-02-24T02:25:21.000Z",
"created_at": "2026-02-23T08:54:38.000Z",
"pipeline_tag": "",
"library_name": ""
}
Source payload excerpt (from Hugging Face API)
{
"_id": "699c15cec055587af256337a",
"id": "nectec/pathumma-thaillm-8b-think-3.0.0-GGUF",
"modelId": "nectec/pathumma-thaillm-8b-think-3.0.0-GGUF",
"sha": "058f6008917180b029ec4e1e9c1bc68a0e0c53da",
"createdAt": "2026-02-23T08:54:38.000Z",
"lastModified": "2026-02-24T02:25:21.000Z",
"author": "nectec",
"downloads": 104,
"likes": 0,
"gated": false,
"private": false,
"pipeline_tag": "",
"library_name": "",
"siblings_count": 9
}